Bill Processing System

Congress.gov to Topic Lake Integration Platform

Python 3.8+ · Async/Await · GenAI · Cloud Storage

System Components

Data Models

classDiagram
    class PageStats {
        +int page_number
        +bool success
        +float processing_time_seconds
        +int input_tokens
        +int output_tokens
        +int total_tokens
        +float cost
    }
    class PDFInfo {
        +str type
        +int index
        +str filename
        +str processing_date
    }
    class BillMetadata {
        +int congress
        +str bill_type
        +int bill_number
        +str title
        +dict sponsor_details
        +list cosponsor_details
        +dict latest_action
        +str origin_chamber
    }
    PDFInfo "1" *-- "1" PageStats : contains
    BillMetadata "1" *-- "1" PDFInfo : contains
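As a rough illustration, the three classes in the diagram could be written as Python dataclasses. Field names and types are taken from the diagram. The `page_stats` and `pdf_info` attributes that carry the two "contains" relationships are hypothetical names, since the diagram does not name them.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PageStats:
    """Per-page processing statistics."""
    page_number: int
    success: bool
    processing_time_seconds: float
    input_tokens: int
    output_tokens: int
    total_tokens: int
    cost: float


@dataclass
class PDFInfo:
    """Information about one processed PDF."""
    type: str
    index: int
    filename: str
    processing_date: str
    page_stats: PageStats  # hypothetical name for the 1-to-1 composition


@dataclass
class BillMetadata:
    """Bill-level metadata."""
    congress: int
    bill_type: str
    bill_number: int
    title: str
    sponsor_details: Dict[str, Any]
    cosponsor_details: List[Any]
    latest_action: Dict[str, Any]
    origin_chamber: str
    pdf_info: PDFInfo  # hypothetical name for the 1-to-1 composition
```

The `typing` generics are used instead of built-in `dict[...]` and `list[...]` so the sketch stays compatible with Python 3.8.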

Document Creation

flowchart TB
    A[Bill Number Input] --> B[Fetch Latest Action]
    B --> C{Has Document?}
    C -->|Yes| D[Create Document]
    C -->|No| E[Error: No Document]
    D --> F[Set Customer Name]
    F --> G[Add Metadata]
    subgraph Metadata Structure
        H[Bill Number]
        I[Sponsor Details]
        J[Co-Sponsor Details]
        K[Latest Action]
        L[Origin Chamber]
        M[Congress Number]
    end
    G --> H
    G --> I
    G --> J
    G --> K
    G --> L
    G --> M
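The flow above can be sketched in Python as follows. This is a minimal sketch and not the platform's actual code: `create_document`, `NoDocumentError`, the `fetch_bill` callable, and the `"document"` key used for the "Has Document?" check are all hypothetical. It is written synchronously for brevity, although the platform itself is described as async.

```python
from typing import Any, Callable, Dict


class NoDocumentError(Exception):
    """Raised when the fetched bill has no document (the 'No' branch)."""


def create_document(
    bill_number: int,
    fetch_bill: Callable[[int], Dict[str, Any]],  # hypothetical fetcher
    customer_name: str,
) -> Dict[str, Any]:
    # Stands in for the "Fetch Latest Action" step.
    bill = fetch_bill(bill_number)

    # "Has Document?": assumed here to mean a non-empty "document" field.
    if not bill.get("document"):
        raise NoDocumentError(f"No document for bill {bill_number}")

    # Create the document, set the customer name, then add the metadata
    # fields listed in the "Metadata Structure" subgraph.
    return {
        "customer_name": customer_name,
        "metadata": {
            "bill_number": bill_number,
            "sponsor_details": bill.get("sponsor_details", {}),
            "cosponsor_details": bill.get("cosponsor_details", []),
            "latest_action": bill.get("latest_action", {}),
            "origin_chamber": bill.get("origin_chamber", ""),
            "congress_number": bill.get("congress"),
        },
    }
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular Congress.gov client and makes the error branch easy to exercise with a stub.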

Atom Generation

flowchart TB
    A[Document] --> B[Extract Pages]
    B --> C[Create Atoms]
    subgraph Atom Metadata
        D[Sort Number]
        E[Page Number]
        F[Title Format]
    end
    C --> D
    C --> E
    C --> F
    D --> G["100 + (page_num - 1) * 10"]
    E --> H["Page {number}"]
    F --> I["Page {number}: {content_title}"]
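A short sketch of the atom metadata rules in the diagram, assuming pages are numbered from 1. The function names, the `page_label` key, and the input shape (a list of dicts with `content_title` and `content` keys) are hypothetical. Only the sort-number formula and the two format strings come from the diagram.

```python
from typing import Any, Dict, List


def sort_number(page_num: int) -> int:
    """Sort number from the diagram: 100 + (page_num - 1) * 10."""
    return 100 + (page_num - 1) * 10


def build_atoms(pages: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Create one atom per extracted page, numbering pages from 1."""
    atoms = []
    for page_num, page in enumerate(pages, start=1):
        atoms.append({
            "sort_number": sort_number(page_num),
            "page_number": page_num,
            "page_label": f"Page {page_num}",
            "title": f"Page {page_num}: {page['content_title']}",
            "content": page["content"],
        })
    return atoms


atoms = build_atoms([
    {"content_title": "Short Title", "content": "..."},
    {"content_title": "Definitions", "content": "..."},
])
print([a["sort_number"] for a in atoms])  # [100, 110]
print(atoms[1]["title"])                  # Page 2: Definitions
```

With this formula, page 1 sorts at 100 and each later page adds 10, which leaves nine unused sort values between consecutive pages.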

Updated Data Model

classDiagram
    class DocumentMetadata {
        +int bill_number
        +dict sponsor_details
        +list cosponsor_details
        +dict latest_action
        +str origin_chamber
        +int congress_number
        +str customer_name
    }
    class AtomMetadata {
        +int sort_number
        +int page_number
        +str title
        +str content
        +dict extra_attributes
    }
    DocumentMetadata "1" *-- "1..*" AtomMetadata : contains
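One possible way to express the updated model in Python, shown only as a sketch. The `atoms` attribute name and the `__post_init__` check are assumptions added to reflect the "1" to "1..*" composition, which requires each document to contain at least one atom.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AtomMetadata:
    sort_number: int
    page_number: int
    title: str
    content: str
    extra_attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class DocumentMetadata:
    bill_number: int
    sponsor_details: Dict[str, Any]
    cosponsor_details: List[Any]
    latest_action: Dict[str, Any]
    origin_chamber: str
    congress_number: int
    customer_name: str
    atoms: List[AtomMetadata]  # hypothetical name for the "contains" link

    def __post_init__(self) -> None:
        # Enforce the 1..* multiplicity from the diagram.
        if not self.atoms:
            raise ValueError("a document must contain at least one atom")
```

Validating the multiplicity at construction time is a design choice for this sketch. The diagram itself does not say where, or whether, that constraint is enforced.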
