If you're using neural networks, I can offer some suggestions. If not, I can't help you much, so this answer is about neural networks.
Though some may disagree, I believe the main question in deep-learning text analysis is whether the sequence of words is important. The original text-parsing neural networks looked at "bags of words," not words in a sequence. Lately word sequence has drawn more interest among data scientists, but it can be overemphasized. Bags of words should be tried before tools that read word sequences.
In your example, it looks like overemphasis on sequences is actually messing up your object (concept) encoding (you don't mention any other outcome/prediction/classification your model needs to predict). If I'm reading you correctly, you have three key words – Bobby, library, book. The direction of the book (checked in vs checked out), if relevant, may require a separate library that translates action phrases into direction words. Other than that, you only need bags of words and the correlations among the three you are interested in, though I'm sure you're seeking something more sophisticated than that.
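To make the bag-of-words idea concrete, here is a minimal sketch in plain Python using the three sentences from the original message. It is an illustration, not a neural network: the keyword set, the `DIRECTION_PHRASES` lookup, and the function names are all hypothetical stand-ins for whatever vocabulary and action-phrase library the real project would use.

```python
import re
from collections import Counter

# Hypothetical keyword set and action-phrase lookup, for illustration only.
KEYWORDS = {"bobby", "library", "book"}
DIRECTION_PHRASES = {
    "checked out": "out",
    "checked the book out": "out",
    "checked in": "in",
    "returned": "in",
}

def bag_of_words(text):
    """Count lowercase word occurrences, discarding word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def keyword_bag(text):
    """Keep only the counts for the keywords of interest."""
    bag = bag_of_words(text)
    return {word: bag[word] for word in KEYWORDS if word in bag}

def direction(text):
    """Map the first matching action phrase to a direction word, else None."""
    lowered = text.lower()
    for phrase, label in DIRECTION_PHRASES.items():
        if phrase in lowered:
            return label
    return None

sentences = [
    "Bobby checked the book out of the library",
    "The book was checked out of the library by Bobby",
    "The librarian handed Bobby the book",
]

for sentence in sentences:
    print(keyword_bag(sentence), direction(sentence))
```

The active and passive sentences produce the same keyword bag and the same direction, which is the point of ignoring sequence. The third sentence contains no "library" token and no listed action phrase, so a sketch this simple would need a richer vocabulary (or stemming) to cover it.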
There are other machine learning methods that don't use neural networks. One advantage of neural networks is that they do not need to produce outcomes (e.g., classification) to be useful – unsupervised and self-supervised neural networks can uncover interesting patterns, such as the Bose-Einstein distribution in some corners of the event space.
Good luck!
------------------------------
Michael Morgan
Managing Director
Morgan Analytics Research Institute
Dallas TX
------------------------------
Original Message:
Sent: 04-04-2023 13:25
From: Carrie Beam
Subject: Text/language parsing help plus front-end developer referral?
Hi all - asking for help on language/text parsing here.
I have a consulting project in which they want to take in a paragraph or so of free-form text, such as a human's dictation of an event or an email describing a situation. We want the computer to be able to extract out meaningful bits from this text, such as who the main agent is, what happened next, and what the final outcome was. The challenge I'm running into is that NLTK/spaCy etc. look like they will parse into subject and object, and even do dependent clauses, but 'Bobby checked the book out of the library' and 'The book was checked out of the library by Bobby' and 'The librarian handed Bobby the book' are three examples of three different text cases, all of which we would want to end up as Main Actor = Bobby, the item = book, librarian = helper, and checked out of the library = the action.
Does anybody have any expertise in this that they would be willing to share, or know of a way you might point me?
(We have tried ChatGPT and it's good but not good enough. It tends to make up stuff, or get it slightly wrong.) If it helps, this is a very narrow situation: the business process is much the same day in and day out, just described differently in different words. We don't need to encompass the whole ocean here.
And on a slightly different note, I am also looking for referrals to a freelance computer programmer good with front-end things (think bubble.io, Angular, JavaScript, those sorts of things), to go on top of a client's back-end thing (SQL mostly.) Any referrals would be most appreciated.
-- Carrie
------------------------------
Carrie Beam, Ph.D.
Director, MSBA Analytics Projects
University of California, Davis
Walnut Creek, CA
cbeam@ucdavis.edu
------------------------------