Notes

Notes - notes.io

talk about api inputs
api outputs
api response
api input types
blob storage
config
in progress, may change - is current state

API info:
Computer Vision API (v3.2)
1. POST Read
- Use this call to perform a Read operation. The Read API is optimized for text-heavy images and multi-page, mixed language, and mixed type documents. The Read operation executes asynchronously. When you call the Read operation, the call returns with a response header called 'Operation-Location'. The 'Operation-Location' header contains a URL with the Operation Id to be used in the second step. In the second step, you use the Get Read Result operation to fetch the detected text lines and words as part of the JSON response. The time for completion of the text extraction process depends on the volume of the text and the number of pages in the document.
Request parameters
language (optional)
string
See https://aka.ms/ocr-languages for list of supported languages.

pages (optional)
string
Page selection is only used for multi-page PDF and TIFF documents. Accepted inputs include single pages (e.g. '1, 2' -> pages 1 and 2 will be processed), finite ranges (e.g. '2-5' -> pages 2 to 5 will be processed) and open-ended ranges (e.g. '5-' -> all pages from page 5 onward will be processed; '-10' -> pages 1 to 10 will be processed). All of these can be mixed together, and ranges are allowed to overlap (e.g. '-5, 1, 3, 5-10' -> pages 1 to 10 will be processed). The service will accept the request if it can process at least one page of the document (e.g. '5-100' on a 5-page document is valid input, and page 5 will be processed). If no page range is provided, the entire document will be processed.
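The page selection rules above can be illustrated with a small local parser. This is only a sketch of the described semantics, not the service's implementation; the function name resolve_pages and the choice to raise ValueError when no page matches are assumptions made for illustration.

```python
from typing import List, Set


def resolve_pages(selection: str, page_count: int) -> List[int]:
    """Return the sorted 1-based page numbers that `selection` picks
    from a document with `page_count` pages."""
    if not selection.strip():
        # No page range given: the entire document is processed.
        return list(range(1, page_count + 1))

    selected: Set[int] = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An open-ended range falls back to the first or last page.
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Keep only pages that actually exist in the document.
        selected.update(range(max(start, 1), min(end, page_count) + 1))

    if not selected:
        raise ValueError(
            f"{selection!r} selects no page of a {page_count}-page document"
        )
    return sorted(selected)
```

For instance, resolve_pages('5-100', 5) keeps only page 5, which mirrors the "at least one page" rule in the parameter description.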

readingOrder (optional)
string
Optional parameter to specify which reading order algorithm should be applied when ordering the extracted text elements. Can be either 'basic' or 'natural'. Defaults to 'basic' if not specified.

model-version (optional)
string
Optional parameter to specify the version of the OCR model used to extract text information for the image/document submitted. Accepted values are: "latest", "latest-preview", "2021-04-12". Defaults to latest if not provided.
Request headers
Content-Type
string
Media type of the body sent to the API.
Ocp-Apim-Subscription-Key
string
Subscription key which provides access to this API. Found in your Cognitive Services accounts.
Request body
Input passed within the POST body. Supported input methods: raw image binary or image URL.
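A minimal sketch of how the request parameters, headers and body described above might be assembled for the image URL input method. It only builds the pieces and sends nothing. The function name build_read_request is hypothetical, and the '/vision/v3.2/read/analyze' path and the {"url": ...} JSON body shape are assumptions that should be checked against the API reference for your resource.

```python
import json
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def build_read_request(
    endpoint: str,
    subscription_key: str,
    image_url: str,
    language: Optional[str] = None,
    pages: Optional[str] = None,
    reading_order: Optional[str] = None,
    model_version: Optional[str] = None,
) -> Tuple[str, Dict[str, str], bytes]:
    """Return (url, headers, body) for a POST Read call with a URL input."""
    optional_params = {
        "language": language,
        "pages": pages,
        "readingOrder": reading_order,
        "model-version": model_version,
    }
    # Only send the optional parameters that were actually provided.
    query = urlencode({k: v for k, v in optional_params.items() if v is not None})

    # Assumed path for the v3.2 Read operation.
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    if query:
        url += "?" + query

    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    # Assumed body shape for the image URL input method.
    body = json.dumps({"url": image_url}).encode("utf-8")
    return url, headers, body
```

For raw image binary input, the body would be the file bytes and Content-Type would change accordingly (commonly application/octet-stream, which is also an assumption here).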

Input requirements:

Supported image formats: JPEG, PNG, BMP, PDF and TIFF.
Note that MPO (Multi Picture Objects) embedded JPEG files are not supported.
For multi-page PDF and TIFF documents:
For the free tier, only the first 2 pages are processed.
For the paid tier, up to 2,000 pages are processed.
Image file size must be less than 50 MB (4 MB for the free tier).
The image/document page dimensions must be at least 50 x 50 pixels and at most 10000 x 10000 pixels.
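The input requirements above could be checked locally before uploading. This is a rough sketch under several assumptions: the function name check_read_input is hypothetical, the file extension is used as a stand-in for the real format, 1 MB is taken as 1024 * 1024 bytes, and the dimension check applies to image pixels only.

```python
from pathlib import PurePath
from typing import List, Optional

# Extensions assumed to correspond to JPEG, PNG, BMP, PDF and TIFF.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff"}
MB = 1024 * 1024  # assumption: binary megabytes


def check_read_input(
    filename: str,
    size_bytes: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
    free_tier: bool = False,
) -> List[str]:
    """Return a list of problems found; an empty list means no issue detected."""
    problems: List[str] = []

    if PurePath(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported format")

    # File size must be less than 50 MB (4 MB for the free tier).
    limit_mb = 4 if free_tier else 50
    if size_bytes >= limit_mb * MB:
        problems.append(f"file must be smaller than {limit_mb} MB")

    # Dimensions must be between 50 x 50 and 10000 x 10000 pixels.
    if width is not None and height is not None:
        if min(width, height) < 50 or max(width, height) > 10000:
            problems.append("dimensions outside 50 x 50 to 10000 x 10000 pixels")

    return problems
```

A check like this cannot detect MPO-embedded JPEG files or count pages; those cases would still surface only when the service responds.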

Responses:
202 - Accepted; the operation will start soon. The client can query the operation status and result by using the Operation Id from the 'Operation-Location' response header value, which is a URL. For example, if the Operation Id is 49a36324-fc4b-4387-aa06-090cfbf0064f, that value is used as the 'operationId' parameter to the Get Read Result operation. The URL expires in 24 hours.
400 - Error.
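Since the 'Operation-Location' header value is a URL that carries the Operation Id, the Id can be pulled out with a few lines of standard-library code. The sketch assumes the Id is the last path segment of that URL; the function name and the example host below are hypothetical.

```python
from urllib.parse import urlparse


def operation_id_from_location(operation_location: str) -> str:
    """Extract the Operation Id, assumed to be the last path segment."""
    path = urlparse(operation_location).path.rstrip("/")
    operation_id = path.rsplit("/", 1)[-1]
    if not operation_id:
        raise ValueError(f"no Operation Id found in {operation_location!r}")
    return operation_id


# Hypothetical header value built around the Operation Id from these notes.
example_location = (
    "https://example.cognitiveservices.azure.com"
    "/vision/v3.2/read/analyzeResults/49a36324-fc4b-4387-aa06-090cfbf0064f"
)
print(operation_id_from_location(example_location))
```

In practice the full header URL can often be requested directly in the second step, so extracting the Id is mainly useful for logging or for building the Get Read Result call by hand.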

2. GET Get Read Result
Use this operation to retrieve the status and OCR result of a Read operation. The input is the 'operationId' from the 'Operation-Location' response header returned by the Read operation. In the following example from a Read operation result, the Operation Id is 49a36324-fc4b-4387-aa06-090cfbf0064f, to be used as the ‘operationId’ parameter to the Get Read Results operation.
Request parameters
operationId
string
Id of the Read operation, contained in the Read operation's 'Operation-Location' response header.

Request headers
Ocp-Apim-Subscription-Key
string
Subscription key which provides access to this API. Found in your Cognitive Services accounts.
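Because the Read operation runs asynchronously, a client usually calls Get Read Result repeatedly until the status reaches 'succeeded' or 'failed'. The sketch below shows one way to structure that loop. The HTTP call is passed in as a fetch callable, so the function name, the polling interval and the attempt limit are all illustrative assumptions rather than values from the API reference.

```python
import time
from typing import Any, Callable, Dict

# Status values that end the polling loop.
TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for_read_result(
    fetch: Callable[[], Dict[str, Any]],
    interval_seconds: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call `fetch` until the returned JSON has a terminal status.

    `fetch` is expected to perform the GET request and return the parsed
    JSON body as a dict.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # 'notStarted' or 'running': wait before asking again.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"Read operation not finished after {max_attempts} attempts")
```

Injecting fetch and sleep keeps the loop easy to test without network access; a real fetch would send the GET request with the Ocp-Apim-Subscription-Key header.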
Response 200
JSON fields in the response body:

Fields Type Description
status String Read operation status. Possible values:
notStarted: The operation has not started.
running: The operation is being processed.
failed: The operation has failed.
succeeded: The operation has succeeded.
If the status is succeeded, the response JSON will further include 'analyzeResult' containing the recognized text, organized as a hierarchy of pages of lines of words.
createdDateTime String The UTC date time the operation was submitted.
lastUpdatedDateTime String The UTC date time the operation status was last updated.
modelVersion String The version of the OCR model used to extract the text information from the submitted image/document.
analyzeResult [Object] Text recognition result of the Read operation.
readResults [Object] A list of extracted text results, one for each page in the input document.
lines [Object] List of text lines. The maximum number of lines returned is 300 per page. The lines are sorted top to bottom, left to right, although in certain cases proximity is treated with higher priority. As the sorting order depends on the detected text, it may change across images and OCR version updates. Thus, business logic should be built upon the actual line location instead of order.
words [Object] List of words in the text line.
boundingBox [Number] Quadrangle bounding box of a line or word, depending on the parent object, specified as a list of 8 numbers. The coordinates are specified relative to the top-left of the original image. The eight numbers represent the four points, clockwise from the top-left corner relative to the text orientation. For image, the (x, y) coordinates are measured in pixels. For PDF, the (x, y) coordinates are measured in inches.
text String The text content of a line or word.
confidence Number Confidence value between 0 and 1 inclusive.
width Number The width of the image/PDF in pixels/inches, respectively.
height Number The height of the image/PDF in pixels/inches, respectively.
angle Number The general orientation of the text in clockwise direction, measured in degrees between (-180, 180].
page Integer The 1-based page number in the input document.
unit String The unit used by the width, height and boundingBox properties. For images, the unit is "pixel". For PDF, the unit is "inch".
language String The input language of the overall document.
appearance Object An object describing the style of the line along with the qualitative confidence score.
style String The general style of the line of text. Possible values:
handwriting: handwritten styled text.
other: other text style.
styleConfidence Number Confidence value between 0 and 1 inclusive.
version String The version of schema used for this result.
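The response fields above form a hierarchy of pages, lines and words, and the description of 'lines' suggests relying on location rather than returned order. The sketch below walks that hierarchy and sorts lines by the top-left corner of their bounding box. The function names are hypothetical, the sample dict is hand-written for illustration rather than real service output, and sorting by the top-left point is only one simple reading of "location".

```python
from typing import Any, Dict, List, Sequence, Tuple


def bbox_points(bounding_box: Sequence[float]) -> List[Tuple[float, float]]:
    """Turn the 8-number boundingBox into four (x, y) points,
    clockwise from the top-left corner."""
    if len(bounding_box) != 8:
        raise ValueError("boundingBox must contain exactly 8 numbers")
    return [(bounding_box[i], bounding_box[i + 1]) for i in range(0, 8, 2)]


def lines_by_position(result: Dict[str, Any]) -> List[Tuple[int, str]]:
    """Return (page, text) pairs ordered by page, then top-left y, then x."""
    rows = []
    for page in result.get("analyzeResult", {}).get("readResults", []):
        for line in page.get("lines", []):
            top_left_x, top_left_y = bbox_points(line["boundingBox"])[0]
            rows.append((page["page"], top_left_y, top_left_x, line["text"]))
    rows.sort()
    return [(page, text) for page, _y, _x, text in rows]


# Hand-written sample in the shape described by the field list.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {
                "page": 1,
                "unit": "pixel",
                "lines": [
                    {"boundingBox": [10, 50, 100, 50, 100, 70, 10, 70],
                     "text": "second line", "words": []},
                    {"boundingBox": [10, 10, 100, 10, 100, 30, 10, 30],
                     "text": "first line", "words": []},
                ],
            }
        ]
    },
}
print(lines_by_position(sample))
```

For rotated or multi-column pages a plain top-left sort would not be enough; the 'angle' field and the full quadrangle would need to be taken into account.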

1. Overview - show README.
- stress 'in progress, may change - is current state'
2. Initial Setup Notebook
- highlight packages of interest - talk about Makefile accelerating this process
3. Data Prep Notebook.
- Stress config.
- Stress Blob storage.
4. Use Cognitive OCR Notebook
- Import packages
- Env variables
- PDF process API function
- Analyze Images and PDFs with Azure CV
- Upload output to Azure Blob Storage
- Show what some of the files look like
