  • Category: .NET

    Reading Information from a PDF File

    Morning everyone,

    I need help reading information from a PDF file.
    The PDF file contains a table, therefore I would like to read the information field by field.

    This is what I used:

    // Requires: using System.IO; using System.Text;
    // using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
    public string ReadPdfFile(string fileName)
    {
        StringBuilder text = new StringBuilder();

        if (File.Exists(fileName))
        {
            PdfReader pdfReader = new PdfReader(fileName);

            for (int page = 1; page <= pdfReader.NumberOfPages; page++)
            {
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                //string strttt = pdfReader.AcroFields.Fields.
                string currentText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);

                currentText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(currentText)));
                text.Append(currentText);
            }
            pdfReader.Close();
        }
        return text.ToString();
    }

    This code reads the text page by page, so I still have to rearrange the data afterwards. Can I read it per field instead? Please help.

    Thanks in advance
  • #769240
    Actually, there is no table in the PDF file as such; the table you see is just drawn lines and positioned text. If you mean AcroFields (form fields), then you can read information from them.
    The example below is based on a free PDF library named Free Spire.PDF. Hope it helps.
    //Load the PDF document
    PdfDocument document = new PdfDocument("Input.pdf");

    //Load the existing forms
    PdfFormWidget loadedForm = document.Form as PdfFormWidget;

    //Go through the forms
    for (int i = 0; i < loadedForm.FieldsWidget.List.Count; i++)
    {
        PdfField field = loadedForm.FieldsWidget.List[i] as PdfField;
        //Fill textbox form field
        if (field is PdfTextBoxFieldWidget)
        {
            PdfTextBoxFieldWidget textField = field as PdfTextBoxFieldWidget;
            //Get the field name and fill with content
            switch (textField.Name)
            {
                case "fieldName":
                    textField.Text = "text";
                    break;
            }
        }
    }
    //Save and close
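
    If the PDF really does contain AcroFields, the values can also be read by name with iTextSharp, the library used in the question. The following is only a minimal sketch under that assumption: the class name PdfFormReader and the method name ReadFormFields are made up for illustration, and a PDF whose "table" is only drawn lines and text will simply yield an empty dictionary.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp itself.
public static class PdfFormReader
{
    // Returns a map of form field name -> current field value.
    public static Dictionary<string, string> ReadFormFields(string fileName)
    {
        var values = new Dictionary<string, string>();
        PdfReader reader = new PdfReader(fileName);
        try
        {
            // AcroFields exposes the interactive form of the document, if any.
            AcroFields form = reader.AcroFields;

            // Fields is keyed by the fully qualified field name.
            foreach (string name in form.Fields.Keys)
            {
                values[name] = form.GetField(name);
            }
        }
        finally
        {
            reader.Close();
        }
        return values;
    }
}
```

    Calling PdfFormReader.ReadFormFields("Input.pdf") and looping over the result gives one name/value pair per field, so no rearranging of page text is needed.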
