APAC CIO Outlook

    Is the PDF the cul-de-sac of data?

    Martin Pickrodt, Chief Information Officer, Mesitis

    The information age has given a new meaning to Francis Bacon’s “Knowledge is Power”. Information and data are becoming ever cheaper to create, store and transmit; data is everywhere and is the foundation for business decisions, loan approvals and marketing strategies. At Canopy, we need to analyse financial data and statements for aggregation purposes, which is what we will focus on here.

    While most companies have embraced this digital age, old habits die hard and we can still find a lot of paper trails: orders, invoices, bank statements, investment analyses and so on. For a user of the data this can be very frustrating, because we need to process our data for analysis, validation, accounting or even regulatory requirements.

    Therefore, our dream is open standards and the willingness of counterparties to provide us with ‘our data’ in a proper format. Ravi Menon, Managing Director of the Monetary Authority of Singapore (MAS), emphasized the importance of open data standards in a recent speech: common standards help against fragmentation, inefficiency and inconvenience, and seamless data sharing will enable higher-quality data that is free of error and commonly understood. It will also allow data to be aggregated and made intelligible, meaning machine readable and machine useable.
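
    To make the aggregation point concrete, here is a minimal sketch of what a shared format allows. The JSON layout, the field names (`security`, `currency`, `market_value`) and the two bank feeds are hypothetical and invented purely for illustration; they are not an actual standard or any institution's real feed.

```python
import json
from collections import defaultdict
from decimal import Decimal

# Hypothetical feeds from two counterparties that follow the same agreed schema.
FEED_BANK_A = """[
  {"security": "Acme Corp", "currency": "USD", "market_value": "12500.00"}
]"""
FEED_BANK_B = """[
  {"security": "Acme Corp", "currency": "USD", "market_value": "7500.00"},
  {"security": "Globex Bond 2025", "currency": "USD", "market_value": "5125.50"}
]"""


def aggregate(feeds):
    """Sum the market value per security across all feeds."""
    totals = defaultdict(Decimal)  # a missing key starts at Decimal("0")
    for feed in feeds:
        for position in json.loads(feed):
            # Decimal avoids binary floating-point rounding on monetary amounts.
            totals[position["security"]] += Decimal(position["market_value"])
    return dict(totals)


if __name__ == "__main__":
    for security, value in aggregate([FEED_BANK_A, FEED_BANK_B]).items():
        print(security, value)
```

    Because both feeds share one structure, combining them takes a few lines and no human has to re-key anything. That is the practical meaning of machine readable and machine useable.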

    Yet companies still work with paper or its modern-day derivative: the PDF file.

    PDF documents are great: every computer can open them and the reader sees exactly what the sender intended to show. That is a major advantage over text document formats, which may take the liberty of changing the formatting upon opening and wreak havoc on a nicely crafted design. PDFs are therefore stable display platforms that avoid issues relating to PEBKAC, a popular term for ‘Problem Exists Between Keyboard And Chair’.

    And yet, PDF documents are terrible: they do not fulfil the requirement of being machine readable. While you can open the document for visual inspection or printing, you cannot readily process the information any further. It is a millstone around the neck of your data: you may have it, but using it in any substantial way is a chore. That is because PDF is a document format and not a data exchange standard. The recipient is therefore stuck with data that is not machine useable.

    Why are PDF documents so popular then? Apart from the stability of the viewing experience, senders of information see the lack of machine readability as an advantage. Because the data cannot easily be used any further, a perception of safety is created. Comparisons are harder to draw, insights are harder to glean and you will find it difficult to take your dataset to a competitor. The PDF is the placeholder for analog technology in a digital world, a neat replacement for paper.

    What can be done with the quantities of data that we would like to use? We need to bring it back into a real electronic format, that much is clear. Many companies choose the hard way: manually re-entering the relevant items. Users of large amounts of data, especially when faced with repeat processes like monthly statements, resort to outsourcing: a back office in a remote part of the world that will do the grunt work. It is a solution that fails the test of true scalability as well as accuracy, not to mention the security concerns.

    At Canopy, we are heavy users of PDF data and we have an overarching requirement for accuracy as well as privacy. The inevitable solution is then to read out and re-interpret the PDF statement itself electronically. We call it cracking the statement. Extracting the data is the easy part: PDF conversion solutions are plentiful and do a good job in the initial extraction. From here the hard part starts, because this pile of data needs to be structured and transformed. Table headings and columns need to be recognized, headers and disclaimers need to be ignored and the content of each field needs to be filled in. That is not very straightforward, but with a few bright programming minds the formats can be cracked. Once that is done, there is no more need for the labour-intensive outsourcing solution halfway around the world. Data is machine readable again, transcription errors disappear and we can establish straight-through processing. Additionally, the speed of transformation improves significantly and the number of data points can be increased without worrying whether the back office needs expansion. Data becomes information again.
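
    The structuring step described above can be sketched roughly as follows. This is an illustrative sketch, not Canopy's actual implementation: the statement text, the bank name, the column names and the regular expressions are all hypothetical. It assumes a generic PDF-to-text converter has already produced plain text in which table columns are separated by runs of two or more spaces.

```python
import re
from decimal import Decimal

# Hypothetical output of a PDF-to-text converter for a one-page statement.
RAW_TEXT = """\
Example Bank Ltd                         Page 1 of 1
Portfolio Statement as of 31 Mar 2017

Security            Quantity      Market Value
Acme Corp             100            12,500.00
Globex Bond 2025       50             5,125.50
Total                                17,625.50

This statement is for information only. E&OE.
"""

HEADER_COLUMNS = ["Security", "Quantity", "Market Value"]
# Data row: free-text name, integer quantity, amount with two decimals,
# each separated by at least two spaces.
ROW = re.compile(r"^(?P<name>.+?)\s{2,}(?P<qty>[\d,]+)\s{2,}(?P<value>[\d,]+\.\d{2})$")
TOTAL = re.compile(r"^Total\s+(?P<value>[\d,]+\.\d{2})$")


def to_decimal(text):
    """Turn '12,500.00' into Decimal('12500.00')."""
    return Decimal(text.replace(",", ""))


def crack_statement(text):
    """Return (rows, stated_total) parsed from converter output."""
    rows, total, in_table = [], None, False
    for raw in text.splitlines():
        line = raw.strip()
        if not in_table:
            # Ignore page headers until the table heading is recognized.
            in_table = all(col in line for col in HEADER_COLUMNS)
            continue
        total_match = TOTAL.match(line)
        if total_match:
            total = to_decimal(total_match["value"])
            continue
        row_match = ROW.match(line)
        if row_match is None:
            continue  # blank line or disclaimer: not table data
        rows.append({
            "security": row_match["name"],
            "quantity": int(row_match["qty"].replace(",", "")),
            "market_value": to_decimal(row_match["value"]),
        })
    return rows, total


if __name__ == "__main__":
    rows, total = crack_statement(RAW_TEXT)
    # Reconcile the parsed rows against the total printed on the statement.
    assert sum(r["market_value"] for r in rows) == total
    print(rows)
```

    Reconciling the parsed rows against the total printed on the statement is one simple way to catch a misread column automatically. Real statement layouts vary by institution, so each format would likely need its own patterns.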

    PDF extraction is only a temporary solution on the path to making data feeds ubiquitous. We hope that companies, and especially financial institutions, will listen to the gentle nudge from regulators and the call from clients in this matter. Until then, automated extraction is a powerful application for bank customers and beyond. Not only is it highly accurate and fast, it also saves cost.

