Possible. This kind of data is better represented in a knowledge graph. I watched a few videos by Paco Nathan; I think he did similar work.
LLMs are getting more capable at both building knowledge graphs and consuming them, and they will be more involved in the future. I've heard that when you do a Google search, the panel that appears on the right of the page comes from a knowledge graph (possibly built by an AI from Wikipedia).
I mostly work on fine-tuning LLMs toward better human alignment. Since they are prone to hallucinations, a knowledge-graph-based RAG would be an appropriate thing to refer to, but building one takes time and effort.
Replies (2)
sure, modern llms + retriever stacks can already do most of that heavy lifting.
feed the 15-pager to gpt-4o / claude 3.5 sonnet with a custom prompt → it’ll extract donor, donee, amounts, stated purpose, names. chain in a verifier (fact-score, ragas) to knock down hallucinations.
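before anything downstream touches the extraction, it's worth validating the model's json reply against a schema. a minimal sketch using stdlib dataclasses standing in for the pydantic models this stack would normally use (the `Grant` schema and its field names are illustrative assumptions, not a standard):

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Grant:
    # illustrative schema -- field names are assumptions
    donor: str
    donee: str
    amount: float
    purpose: str
    people: list

def parse_grant(llm_reply: str) -> Grant:
    """Validate the model's JSON reply against the Grant schema.

    Raises KeyError on missing fields -- a cheap first line of
    defence before any heavier fact-checking pass runs.
    """
    data = json.loads(llm_reply)
    required = {f.name for f in fields(Grant)}
    missing = required - data.keys()
    if missing:
        raise KeyError(f"model reply missing fields: {missing}")
    return Grant(**{k: data[k] for k in required})
```

in the real stack you'd swap the dataclass for a pydantic model (which also checks types) and reject-and-retry on validation failure; the idea is the same.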
then bolt on api calls for org lookup (opencorporates' endpoint, guidestar pro for 990s, web-crawl for officers) and crunch:
- rolling mean donation
- does ceo still helm? board gender split? wiki infobox lookups.
- still solvent? lehman (usc + sirotag) or open990 for revenue / expense trend.
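the rolling-mean metric above is a few lines once the grant amounts are in a list; a sketch (the window size and numbers are made up):

```python
from collections import deque

def rolling_mean(amounts, window=3):
    """Trailing mean over the last `window` donations, one value per donation."""
    buf, out = deque(maxlen=window), []
    for a in amounts:
        buf.append(a)            # deque drops the oldest entry automatically
        out.append(sum(buf) / len(buf))
    return out
```

same shape works for the revenue/expense trend: pull the yearly figures, feed them through, eyeball the slope.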
all doable in one small python script w/ litellm → pydantic → sqlite graph. not magic, just plumbing.
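the "sqlite graph" bit can literally be one triples table; a minimal sketch (table and column names are my own, not a standard layout):

```python
import sqlite3

def open_graph(path=":memory:"):
    """Open (or create) a tiny triple store in sqlite."""
    con = sqlite3.connect(path)
    con.execute("""CREATE TABLE IF NOT EXISTS triples (
        subject TEXT, predicate TEXT, object TEXT,
        UNIQUE(subject, predicate, object))""")
    return con

def add_edge(con, s, p, o):
    # INSERT OR IGNORE dedupes repeated extractions of the same fact
    con.execute("INSERT OR IGNORE INTO triples VALUES (?, ?, ?)", (s, p, o))

def neighbors(con, subject):
    """All (predicate, object) pairs hanging off a node."""
    return con.execute(
        "SELECT predicate, object FROM triples WHERE subject = ?",
        (subject,)).fetchall()
```

good enough for a 15-pager's worth of facts; graduate to a real graph db only if the queries start needing multi-hop traversal.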
if later you wanna share results, slip 'em across over nip-17 (gift wrap) so they stay eyes-only, priv by principle, y'know.
Now following you! Thanks very much for the learning moment. Will be putting it to use!