Document Scoring Pipeline
Document Scoring is useful when a large set of documents must be scored against a set of features and scoring criteria. It produces a score for each document in JSON format. Users can then rank the documents using custom ranking criteria.
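As an illustration of custom ranking, the sketch below sorts scored documents by a weighted sum of their feature scores. The JSON field names ("document", "scores"), the file names, and the weights are assumptions made for this example; the actual DocScore output schema may differ.

```python
import json

# Hypothetical scoring output: one entry per document, with a score per feature.
# The field names and values are illustrative assumptions, not the real schema.
scored_json = """
[
  {"document": "cv_alice.pdf", "scores": {"python": 8, "sql": 5, "leadership": 3}},
  {"document": "cv_bob.pdf",   "scores": {"python": 6, "sql": 9, "leadership": 7}},
  {"document": "cv_carol.pdf", "scores": {"python": 9, "sql": 6, "leadership": 2}}
]
"""


def rank_documents(scored, weights):
    """Sort documents by the weighted sum of their feature scores, highest first."""
    def total(entry):
        # Features without a weight contribute nothing to the total.
        return sum(weights.get(feature, 0) * score
                   for feature, score in entry["scores"].items())
    return sorted(scored, key=total, reverse=True)


# Custom ranking criteria: integer weights per feature (assumed for illustration).
weights = {"python": 5, "sql": 3, "leadership": 2}
ranking = rank_documents(json.loads(scored_json), weights)

for entry in ranking:
    print(entry["document"])
```

Changing the weights changes the order without re-scoring the documents, which is the reason to keep scoring and ranking as separate steps.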
Data Flow
DocScore without Datasource
flowchart LR
A[Input Text] --> B[WebApp Interface]
E[Documents to Score] --> |Load|B
B --> |Extract Features|C[PFT Framework]
C --> |Extracted Features|B
B --> |Features + Input Text + Doc Content|D[DocScore Framework]
D --> |Scored output summary|B
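The flow above can be sketched in a few lines of Python. Both functions are simple stand-ins invented for this example: extract_features plays the role of the PFT Framework step and score_document the role of the DocScore Framework step. In the real pipeline both would be backed by configured usecases rather than string matching.

```python
import json


def extract_features(input_text):
    """Stand-in for the PFT Framework step (hypothetical):
    treat each comma-separated item of the input text as one feature."""
    return [part.strip().lower() for part in input_text.split(",") if part.strip()]


def score_document(features, input_text, doc_content):
    """Stand-in for the DocScore Framework step (hypothetical):
    score 1 if the feature appears in the document, else 0.
    input_text is accepted to mirror the diagram but unused by this toy scorer."""
    text = doc_content.lower()
    return {feature: int(feature in text) for feature in features}


def run_pipeline(input_text, documents):
    """Extract features once, then score every loaded document against them."""
    features = extract_features(input_text)
    return {name: score_document(features, input_text, content)
            for name, content in documents.items()}


# Documents to score, keyed by an illustrative file name.
documents = {
    "cv_a.txt": "Five years of Python and SQL development.",
    "cv_b.txt": "Project manager with Agile experience.",
}

result = run_pipeline("Python, SQL, Agile", documents)
print(json.dumps(result, indent=2))
```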
DocScore with Datasource
This pipeline is used when the document content is stored in a datasource. You can shortlist the documents to score using an LLM-generated "search query" for each feature.
flowchart LR
A[Input Text] --> B[WebApp Interface]
B --> |Extract Features|C[PFT Framework 1]
C --> |Features List|B
B --> |Generate search query \nfor each feature|E[PFT Framework 2]
E --> |Search Query|B
B --> |Search Query|F[Datasource]
F --> |Relevant Document Content|B
B --> |Features + Input Text + Doc Content|D[DocScore Framework]
D --> |Scored output summary|B
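The shortlisting step that distinguishes this flow can be sketched as follows. The datasource is modeled as an in-memory dictionary and the search as a substring match; generate_search_query is a stand-in for the PFT Framework 2 step. All of these are assumptions for illustration, not the actual datasource or framework API.

```python
def generate_search_query(feature):
    """Stand-in for the PFT Framework 2 step (hypothetical):
    the real step would ask an LLM for a query; here the feature is the query."""
    return feature.lower()


def search_datasource(datasource, query):
    """Toy search: return the documents whose content contains the query."""
    return {name: content for name, content in datasource.items()
            if query in content.lower()}


def shortlist(features, datasource):
    """Run one search per feature and merge the hits into a single shortlist."""
    selected = {}
    for feature in features:
        query = generate_search_query(feature)
        selected.update(search_datasource(datasource, query))
    return selected


# In-memory stand-in for a datasource holding the document content.
datasource = {
    "cv_a.txt": "Python developer with SQL experience",
    "cv_b.txt": "Graphic designer, Photoshop and Illustrator",
    "cv_c.txt": "Database administrator, SQL and PostgreSQL",
}

shortlisted = shortlist(["Python", "SQL"], datasource)
print(sorted(shortlisted))
```

Only the shortlisted documents are passed on to the scoring step, so documents that match no feature are never scored.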
DocScore Pipeline Creation
Please use the following steps:
- [Optional] Create a Datasource using the files to be scored.
- Create usecases for every step of the pipeline. Example: CV scoring for given Job Description. Refer to the usecase JD-CV-Scoring.
  - Usecase 1: Extract Features from given JD (PFT Framework 1)
  - Usecase 2: Generate search query for given feature (PFT Framework 2)
  - Usecase 3: Score given CVs for JD Features and scoring criteria (DocScore Framework)
- Create the pipeline_config.json file for each step as follows. Note: datasource is optional.
- Prepare the CVs to be scored.
- Perform scoring using JD-CV-Scoring usecase.