Document Scoring Pipeline

Document Scoring is useful when a large set of documents needs to be scored against a set of features and scoring criteria. The pipeline returns a score for each document in JSON format. Users can then rank the documents using custom ranking criteria.
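As a minimal sketch of custom ranking, assume each scored document is a JSON object holding numeric per-feature scores. The field names `document` and `scores`, the file names, and the weights below are hypothetical and are not the framework's actual output schema.

```python
import json

# Hypothetical scored output; the real DocScore JSON schema may differ.
scored_json = """
[
  {"document": "cv_a.pdf", "scores": {"python": 8, "leadership": 3}},
  {"document": "cv_b.pdf", "scores": {"python": 5, "leadership": 9}},
  {"document": "cv_c.pdf", "scores": {"python": 7, "leadership": 7}}
]
"""

# Hypothetical ranking criteria: one weight per feature.
weights = {"python": 0.7, "leadership": 0.3}


def weighted_total(entry, weights):
    """Combine per-feature scores into one number; missing features count as 0."""
    return sum(w * entry["scores"].get(feature, 0) for feature, w in weights.items())


def rank_documents(scored, weights):
    """Return document names ordered from best to worst weighted total."""
    ordered = sorted(scored, key=lambda e: weighted_total(e, weights), reverse=True)
    return [e["document"] for e in ordered]


ranking = rank_documents(json.loads(scored_json), weights)
print(ranking)
```

Changing the weights changes the order without re-scoring any document, which is why ranking is kept as a separate step after scoring.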

Data Flow

DocScore without Datasource

flowchart LR
    A[Input Text] --> B[WebApp Interface]
    E[Documents to Score] --> |Load|B
    B --> |Extract Features|C[PFT Framework]
    C --> |Extracted Features|B
    B --> |Features + Input Text + Doc Content|D[DocScore Framework]
    D --> |Scored output summary|B

DocScore with Datasource

This pipeline is used when the document content is stored in a datasource. You can shortlist the documents to score using an LLM-generated "search query" for each feature.

flowchart LR
    A[Input Text] --> B[WebApp Interface]
    B --> |Extract Features|C[PFT Framework 1]
    C --> |Features List|B
    B --> |Generate search query \nfor each feature|E[PFT Framework 2]
    E --> |Search Query|B
    B --> |Search Query|F[Datasource]
    F --> |Relevant Document Content|B
    B --> |Features + Input Text + Doc Content|D[DocScore Framework]
    D --> |Scored output summary|B
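The flow with a datasource can be sketched in code as follows. This is only an illustration of the order of calls in the flowchart: `run_doc_scoring` and its four callable parameters are hypothetical stand-ins for PFT Framework 1, PFT Framework 2, the datasource, and the DocScore Framework, not the product's API.

```python
from typing import Callable, Dict, List


def run_doc_scoring(
    input_text: str,
    extract_features: Callable[[str], List[str]],        # stand-in for PFT Framework 1
    generate_query: Callable[[str], str],                # stand-in for PFT Framework 2
    search_datasource: Callable[[str], Dict[str, str]],  # query -> {doc_id: content}
    score_document: Callable[[List[str], str, str], dict],  # stand-in for DocScore
) -> Dict[str, dict]:
    """Follow the flowchart: features -> queries -> shortlist -> scores."""
    features = extract_features(input_text)

    # Shortlist documents: one search query per feature, results merged by id.
    shortlisted: Dict[str, str] = {}
    for feature in features:
        shortlisted.update(search_datasource(generate_query(feature)))

    # Score each shortlisted document against all features and the input text.
    return {
        doc_id: score_document(features, input_text, content)
        for doc_id, content in shortlisted.items()
    }


# Toy stand-ins so the sketch runs without any LLM or datasource.
docs = {"cv1": "python developer", "cv2": "sales manager", "cv3": "python and sql"}
results = run_doc_scoring(
    input_text="python, sql",
    extract_features=lambda text: [part.strip() for part in text.split(",")],
    generate_query=lambda feature: feature,
    search_datasource=lambda q: {k: v for k, v in docs.items() if q in v},
    score_document=lambda feats, _text, content: {f: int(f in content) for f in feats},
)
print(results)
```

Here `cv2` never reaches the scoring step because no feature query retrieves it. Shortlisting through the datasource is meant to do exactly this.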

Doc Score Pipeline Creation

Follow these steps:

  1. [Optional] Create a Datasource using the files to be scored

  2. Create a usecase for each step of the pipeline.

    Example: CV scoring for a given Job Description (JD). Refer to the usecase JD-CV-Scoring.

    • Usecase 1: Extract features from the given JD (PFT Framework 1)

    • Usecase 2: Generate a search query for a given feature (PFT Framework 2)

    • Usecase 3: Score the given CVs against the JD features and scoring criteria (DocScore Framework)

  3. Create the pipeline_config.json file, listing the usecase for each step as follows. Note: the datasource entry is optional.

    {
        "DocScoring": {
            "frameworks": {
                "jd-feature-extraction": "jd-feature-extraction",
                "feature-query-extraction": "feature-query-extraction",
                "jd-cv-scoring": "JD-CV-Scoring"
            },
            "datasource": "cv-datasource"
        }
    }
    
  4. Prepare the CVs to be scored.

  5. Perform scoring using the JD-CV-Scoring usecase.
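Before running the pipeline, it can help to check locally that pipeline_config.json is well-formed. The sketch below assumes the layout shown in step 3. The helper name `load_pipeline_config` is hypothetical, and how the platform itself reads the file is not documented here.

```python
import json

# Config text mirroring the example in step 3 (valid JSON has no trailing commas).
CONFIG_TEXT = """
{
    "DocScoring": {
        "frameworks": {
            "jd-feature-extraction": "jd-feature-extraction",
            "feature-query-extraction": "feature-query-extraction",
            "jd-cv-scoring": "JD-CV-Scoring"
        },
        "datasource": "cv-datasource"
    }
}
"""


def load_pipeline_config(text, pipeline="DocScoring"):
    """Parse the config and return (frameworks, datasource or None)."""
    section = json.loads(text)[pipeline]
    frameworks = section.get("frameworks")
    if not isinstance(frameworks, dict) or not frameworks:
        raise ValueError(f"'{pipeline}.frameworks' must be a non-empty object")
    # The datasource is optional, so fall back to None when it is absent.
    return frameworks, section.get("datasource")


frameworks, datasource = load_pipeline_config(CONFIG_TEXT)
print(frameworks["jd-cv-scoring"], datasource)
```

`json.loads` raises `json.JSONDecodeError` on malformed input such as a trailing comma, so the same check also catches syntax slips in a hand-edited file.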