A quick start .NET console project that shows how to extract text from a PDF document using the Syncfusion® PDF Library.
Framework and SDKs
- .NET SDK (version 5.0 or later)
IDEs
- Visual Studio 2019 or Visual Studio 2022
We will create a new .NET console application, add the Syncfusion® PDF library package, and write the code to extract text from the first page of a PDF document:
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;
//Get stream from an existing PDF document.
FileStream docStream = new FileStream("Input.pdf", FileMode.Open, FileAccess.Read);
//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument(docStream);
//Load the first page.
PdfPageBase page = loadedDocument.Pages[0];
//Extract text from first page.
string extractedText = page.ExtractText();
//Save the text.
File.WriteAllText("Result.txt", extractedText);
//Close the document.
loadedDocument.Close(true);
The following code extracts text from the first page while preserving the layout of the text, by passing true to ExtractText:
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;
//Get stream from an existing PDF document.
FileStream docStream = new FileStream("Invoice.pdf", FileMode.Open, FileAccess.Read);
//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument(docStream);
//Load first page.
PdfPageBase page = loadedDocument.Pages[0];
//Extract text from the first page, preserving the layout.
string extractedTexts = page.ExtractText(true);
//Save the text.
File.WriteAllText("data.txt", extractedTexts);
//Close the document.
loadedDocument.Close(true);
The following code extracts text from every page of a PDF document:
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;
//Get stream from an existing PDF document.
FileStream docStream = new FileStream("Data.pdf", FileMode.Open, FileAccess.Read);
//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument(docStream);
string extractedText = string.Empty;
//Extract all the text from the PDF document pages.
foreach (PdfLoadedPage loadedPage in loadedDocument.Pages) {
    extractedText += loadedPage.ExtractText();
}
//Save the text to file.
File.WriteAllText("data.txt", extractedText);
//Close the document.
loadedDocument.Close(true);
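The loop above appends each page's text directly to a single string, so page boundaries are not marked in the output. As a minimal sketch, assuming you want a marker line before each page, the per-page strings could be combined with a StringBuilder. The PageTextJoiner class, its Join method, and the marker format are hypothetical names introduced for illustration and are not part of the Syncfusion® API. The sample strings stand in for the values returned by loadedPage.ExtractText().

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class PageTextJoiner
{
    // Hypothetical helper: joins per-page text and prefixes each page with a marker line.
    public static string Join(IEnumerable<string> pageTexts)
    {
        var builder = new StringBuilder();
        int pageNumber = 1;
        foreach (string pageText in pageTexts)
        {
            builder.Append("--- Page ").Append(pageNumber).AppendLine(" ---");
            builder.AppendLine(pageText);
            pageNumber++;
        }
        return builder.ToString();
    }
}

public static class JoinDemo
{
    public static void Main()
    {
        // Stand-in strings; in the sample above each would come from loadedPage.ExtractText().
        var pages = new List<string> { "First page text", "Second page text" };
        Console.Write(PageTextJoiner.Join(pages));
    }
}
```

A StringBuilder also avoids allocating a new string on every iteration, which may matter for documents with many pages.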
The following code extracts a specific piece of text (here, an invoice number) from the first page by comparing the bounds of each word with a known rectangle:
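using System.IO;
using Syncfusion.Drawing;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;
//RectangleF is assumed to come from Syncfusion.Drawing, as in the Syncfusion.Pdf.Net.Core package.
//Get stream from an existing PDF document.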
FileStream docStream = new FileStream("Invoice.pdf", FileMode.Open, FileAccess.Read);
//Load the PDF document.
PdfLoadedDocument loadedDocument = new PdfLoadedDocument(docStream);
//Get the first page of the loaded PDF document.
PdfPageBase page = loadedDocument.Pages[0];
//Create line collection.
var lineCollection = new TextLineCollection();
//Extract text from the first page.
page.ExtractText(out lineCollection);
RectangleF textBounds = new RectangleF(474.96198f, 161.62997f, 50.040073f, 9);
string invoiceNumber = "";
//Get the text provided in the bounds.
foreach (TextLine textLine in lineCollection.TextLine) {
    foreach (TextWord word in textLine.WordCollection) {
        if (textBounds == word.Bounds) {
            invoiceNumber = word.Text;
            break;
        }
    }
}
//Save the text to file.
File.WriteAllText("data.txt", invoiceNumber);
//Close the PDF document.
loadedDocument.Close(true);
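The sample above matches a word only when its bounds are exactly equal to the hard-coded rectangle, so a small difference in any coordinate causes a miss. As a hedged sketch, a tolerance-based comparison could be used instead. The BoundsMatcher class, the NearlyEqual method, and the 0.5-point default tolerance are hypothetical and not part of the Syncfusion® API. To keep the example self-contained, System.Drawing.RectangleF is used here as a stand-in for the rectangle type used with the PDF library.

```csharp
using System;
using System.Drawing;

public static class BoundsMatcher
{
    // Hypothetical helper: true when X, Y, Width, and Height each differ by at most `tolerance`.
    public static bool NearlyEqual(RectangleF a, RectangleF b, float tolerance = 0.5f)
    {
        return Math.Abs(a.X - b.X) <= tolerance
            && Math.Abs(a.Y - b.Y) <= tolerance
            && Math.Abs(a.Width - b.Width) <= tolerance
            && Math.Abs(a.Height - b.Height) <= tolerance;
    }
}

public static class BoundsDemo
{
    public static void Main()
    {
        // The first rectangle uses the values from the sample above; the second is a rounded copy.
        var expected = new RectangleF(474.96198f, 161.62997f, 50.040073f, 9f);
        var measured = new RectangleF(474.96f, 161.63f, 50.04f, 9f);
        Console.WriteLine(expected == measured);                          // False
        Console.WriteLine(BoundsMatcher.NearlyEqual(expected, measured)); // True
    }
}
```

In the loop above, the condition textBounds == word.Bounds could then be replaced with a call to such a helper, with the tolerance tuned to the document.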
- Download this project to a location on your disk.
- Open the solution file using Visual Studio.
- Rebuild the solution to install the required NuGet package.
- Run the application.
- Product page: Syncfusion® PDF Framework
- Documentation page: Syncfusion® .NET PDF library
- Online demo: Syncfusion® .NET PDF library - Online demos
- Blog: Syncfusion® .NET PDF library - Blog
- Knowledge Base: Syncfusion® .NET PDF library - Knowledge Base
- EBooks: Syncfusion® .NET PDF library - EBooks
- FAQ: Syncfusion® .NET PDF library - FAQ
- For any other queries, contact the Syncfusion® support team or post your questions in the community forums.
- Request a new feature through the Syncfusion® feedback portal.
This is a commercial product and requires a paid license for possession or use. Syncfusion's licensed software, including this component, is subject to the terms and conditions of Syncfusion's EULA. You can purchase a license here or start a free 30-day trial here.
Founded in 2001 and headquartered in Research Triangle Park, N.C., Syncfusion® has more than 26,000 customers and more than 1 million users, including large financial institutions, Fortune 500 companies, and global IT consultancies.
Today, we provide 1600+ components and frameworks for web (Blazor, ASP.NET Core, ASP.NET MVC, ASP.NET WebForms, JavaScript, Angular, React, Vue, and Flutter), mobile (Xamarin, Flutter, UWP, and JavaScript), and desktop development (WinForms, WPF, WinUI (Preview), Flutter, and UWP). We provide ready-to-deploy enterprise software for dashboards, reports, data integration, and big data processing. Many customers have saved millions in licensing fees by deploying our software.