INFORMS Open Forum

  • 1.  Text/language parsing help plus front-end developer referral?

    Posted 04-04-2023 13:25

    Hi all - asking for help on language/text parsing here. 

    I have a consulting project in which they want to take in a paragraph or so of free-form text, such as a human's dictation of an event or an email describing a situation.  We want the computer to be able to extract meaningful bits from this text, such as who the main agent is, what happened next, and what the final outcome was.  The challenge I'm running into is that NLTK/spaCy etc. look like they will parse into subject and object, and even do dependent clauses, but 'Bobby checked the book out of the library' and 'The book was checked out of the library by Bobby' and 'The librarian handed Bobby the book' are three different phrasings, all of which we would want to end up as Main Actor = Bobby, the item = book, librarian = helper, and checked out of the library = the action. 

    Does anybody have any expertise in this that they would be willing to share, or know of a way you might point me? 

    (We have tried ChatGPT and it's good but not good enough.  It tends to make up stuff, or get it slightly wrong.)  If it helps, this is a very narrow situation: the business process is much the same day in and day out, just described differently in different words.  We don't need to encompass the whole ocean here.

    And on a slightly different note, I am also looking for referrals to a freelance computer programmer good with front-end things (think bubble.io, Angular, JavaScript, those sorts of things), to go on top of a client's back-end thing (SQL mostly.)  Any referrals would be most appreciated.

    -- Carrie



    ------------------------------
    Carrie Beam, Ph.D.
    Director, MSBA Analytics Projects
    University of California, Davis
    Walnut Creek, CA
    cbeam@ucdavis.edu
    ------------------------------


  • 2.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-05-2023 09:41

    Hi Carrie:
    Semantic role labeling (SRL) should be useful for your first task. AllenNLP has an off-the-shelf model demo: 
    https://demo.allennlp.org/semantic-role-labeling/semantic-role-labeling
    You may have to fine-tune it using in-domain data.
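
    To make the idea concrete, here is a minimal sketch of how SRL-style output could be mapped onto the fields from the original question. The frames are hand-written in PropBank-style notation (ARG0 for the agent, ARG1 for the thing acted on, ARG2 used here for the recipient); they are illustrative and are not actual output from the AllenNLP model. The `TRANSFER_VERBS` rule and the field names are assumptions made for this example.

```python
# Hand-written frames in PropBank-style notation. Illustrative only,
# not real model output.
FRAMES = [
    {"verb": "checked", "ARG0": "Bobby", "ARG1": "the book"},   # active voice
    {"verb": "checked", "ARG1": "The book", "ARG0": "Bobby"},   # passive voice
    {"verb": "handed", "ARG0": "The librarian", "ARG2": "Bobby", "ARG1": "the book"},
]

# Hypothetical domain rule: for transfer verbs the recipient is the main
# actor and the giver is the helper.
TRANSFER_VERBS = {"handed", "gave", "issued"}


def strip_article(phrase):
    """Drop a leading article and lowercase nothing else."""
    words = phrase.split()
    if words and words[0].lower() in {"the", "a", "an"}:
        words = words[1:]
    return " ".join(words)


def normalize(frame):
    """Map one SRL-style frame onto the business fields from the question."""
    agent = frame.get("ARG0")
    recipient = frame.get("ARG2")
    if frame["verb"] in TRANSFER_VERBS and recipient:
        main_actor, helper = recipient, agent
    else:
        main_actor, helper = agent, None
    return {
        "main_actor": strip_article(main_actor) if main_actor else None,
        "item": strip_article(frame["ARG1"]) if frame.get("ARG1") else None,
        "helper": strip_article(helper) if helper else None,
    }


records = [normalize(f) for f in FRAMES]
for record in records:
    print(record)
```

    The active and passive sentences produce the same record because the role labels, unlike subject and object, do not change with voice. The third sentence needs the extra domain rule, which is the kind of thing in-domain fine-tuning or a small rule table would have to supply.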

    --
    Feng Mai, Ph.D.
    Associate Professor of Information Systems & Analytics
    School of Business
    Stevens Institute of Technology



    ------------------------------
    Feng Mai
    Stevens Institute of Technology
    Hoboken NJ
    ------------------------------



  • 3.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-06-2023 11:49
    Edited by Yves Rychener 04-06-2023 11:49

    Hi Carrie
    I can only second Feng Mai's answer. As far as I am aware, the tools you originally referred to (NLTK and spaCy) create a syntax tree. However, you are interested in a semantic tree/representation. The tool referred to by Feng Mai seems quite good. If you want to dig deeper into it, I suggest you check out the semantic parsing section on Papers with Code: https://paperswithcode.com/task/semantic-parsing

    Best,
    Yves



    ------------------------------
    Yves Rychener
    PHD Candidate
    EPFL
    Echandens
    ------------------------------



  • 4.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-06-2023 14:47

    Hi Carrie,

    For the referral, you could email Deji, a former colleague who does freelance development alongside his current role at MS. I know he does a lot of work using React, JavaScript and the like: ejiadedeji@gmail.com.

    --
    Izuwa, Ahanor
    Ph.D. Candidate,
    Industrial and Systems Engineering,
    University of Tennessee.



    ------------------------------
    Izuwa Ahanor
    The University of Tennessee
    Knoxville TN
    ------------------------------



  • 5.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-06-2023 15:05

    When I started reading this in the digest, I immediately thought of ChatGPT, so when Carrie said they already tried that I moved on.

    Now seeing it again, I still think AI is the most likely path to success. Possibly something specific to text extraction? My first thought is Amazon Comprehend, and I suspect the other major cloud providers have competing tools.

    I'm happy to talk to Carrie directly if that helps.



    ------------------------------
    Mack Earnhardt
    CTO
    Agile Reasoning
    Carmel IN
    ------------------------------



  • 6.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-06-2023 18:30

    Hi Carrie,
    Following up on GPT, I suggest trying GPT-4 with chain-of-thought prompting.
    https://arxiv.org/pdf/2201.11903.pdf
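
    As a rough illustration of what such a prompt could look like, the sketch below builds a prompt with one worked exemplar that spells out its reasoning before the answer. The exemplar text, the field names, and `build_cot_prompt` are all hypothetical, and the code only builds the string; sending it to a model is left out.

```python
# Hypothetical worked example. The "Reasoning" line is what makes this
# chain-of-thought prompting rather than plain few-shot prompting.
EXEMPLAR = """Text: The book was checked out of the library by Bobby.
Reasoning: The sentence is passive. The person doing the checking out is Bobby. The thing checked out is the book. No one else is mentioned.
Answer: {"main_actor": "Bobby", "item": "book", "helper": null, "action": "checked out of the library"}"""


def build_cot_prompt(text):
    """Return a prompt with one reasoning exemplar followed by the new text."""
    return (
        "Extract the main actor, item, helper and action from the passage. "
        "Use only words that appear in the passage. "
        "Reason step by step, then give the answer as JSON.\n\n"
        f"{EXEMPLAR}\n\n"
        f"Text: {text}\n"
        "Reasoning:"
    )


prompt = build_cot_prompt("The librarian handed Bobby the book.")
print(prompt)
```

    Since the business process is narrow, a handful of exemplars drawn from real (anonymized) descriptions would probably be more useful than the library sentence used here.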
    Good luck,
    Moses

    Moses Miller, Assistant Professor
    Data Science Department
    Arison School of Business
    Reichman University






    ------------------------------
    Moses Miller
    Assistant Professor
    Reichman University
    Herzliya
    ------------------------------



  • 7.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-06-2023 21:31
    Hi Carrie,
     
    Classical methods like Semantic Role Labeling and Semantic Dependency Parsing can be helpful for extracting subject-verb-object (SVO) structure from sentences. They are well studied, and there are some demos online that you can find easily with a Google search. 
     
    But language models like ChatGPT are also a viable way to do NLP tasks nowadays. You would mainly need to watch for truthfulness problems in their output. 
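
    One simple guard against made-up content, sketched here under the assumption that the extraction comes back as a dictionary of short strings, is to check that every extracted value actually occurs in the source text. The function and field names are hypothetical.

```python
import re


def normalize_text(s):
    """Lowercase and reduce the text to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", s.lower()))


def ungrounded_fields(source, extracted):
    """Return the names of fields whose values do not appear in the source."""
    haystack = f" {normalize_text(source)} "
    missing = []
    for field, value in extracted.items():
        if value is None:
            continue  # an empty field cannot be a fabrication
        if f" {normalize_text(value)} " not in haystack:
            missing.append(field)
    return missing


source = "The librarian handed Bobby the book."
good = {"main_actor": "Bobby", "item": "book", "helper": "librarian"}
bad = {"main_actor": "Robert", "item": "book", "helper": None}

print(ungrounded_fields(source, good))
print(ungrounded_fields(source, bad))
```

    This only catches values that are absent from the text. It does not catch a value that is present but assigned to the wrong role, so it complements rather than replaces a review of the output.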

    Best,
    Minjia Mao


    ------------------------------
    Minjia Mao
    University of Delaware
    Newark DE
    ------------------------------



  • 8.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-07-2023 07:14

    If you're using neural networks, I can offer some suggestions.  If not, I can't help you too much.  So this answer is about neural networks.

    Though some may disagree, I believe the main question with deep learning textual analysis is whether the sequence of words is important.  The original text parsing neural networks looked at "bags of words," not words in a sequence.  Lately word sequence has been of more interest among data scientists, but it can be overemphasized.  Bags of words should be tried before tools to read word sequences. 

    In your example, it looks like overemphasis on sequences is actually messing up your object (concept) encoding (you don't mention any other outcome/prediction/classification your model needs to predict).  If I'm reading you correctly, you have three key words – Bobby, library, book.  The direction of the book (checked in vs checked out), if relevant, may require a separate library that translates action phrases into direction words.  Other than that, you only need bags of words and the correlations among the three you are interested in, though I'm sure you're seeking something more sophisticated than that.
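
    A small sketch of the bag-of-words view, using the sentences from the original question, shows both what it captures and what it cannot distinguish. The stopword list is a tiny illustrative one chosen for these sentences, not a standard list.

```python
import re
from collections import Counter

# Tiny illustrative stopword list, chosen for these sentences only.
STOPWORDS = {"the", "was", "out", "of", "by"}


def bag(sentence):
    """Return word counts for a sentence, ignoring order and stopwords."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


active = bag("Bobby checked the book out of the library")
passive = bag("The book was checked out of the library by Bobby")
print(active == passive)  # same words, so the two phrasings match

given = bag("The librarian handed Bobby the book")
swapped = bag("Bobby handed the librarian the book")
print(given == swapped)  # also the same words, although the roles are reversed
```

    The first comparison is the case where ignoring sequence helps: active and passive phrasings collapse to the same bag. The second is the case where it does not: the bag alone cannot say who handed the book to whom.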

    There are other machine learning methods that don't use neural networks.  One advantage of neural networks is that they do not need to produce outcomes (e.g., classification) to be useful – unsupervised and self-supervised neural networks can uncover interesting patterns, such as the Bose-Einstein distribution in some corners of the event space.

    Good luck!



    ------------------------------
    Michael Morgan
    Managing Director
    Morgan Analytics Research Institute
    Dallas TX
    ------------------------------



  • 9.  RE: Text/language parsing help plus front-end developer referral?

    Posted 04-07-2023 14:59

    Hi Carrie, 
    If I understand your use case correctly, it is an SRL problem, as already suggested by some colleagues on this platform.

    A good SRL model based on AllenNLP that could suit your use case can be found here:
    https://github.com/asofiaoliveira/srl_bert_pt

    It is very nicely explained, and easy to follow step-by-step.

    As with most pretrained-transformer projects, if you are NOT using a cloud platform I would recommend setting up a conda or Python virtual environment in your editor of choice (e.g. VS Code). You can then set up NVIDIA CUDA for PyTorch (or TensorFlow) in that environment, assuming your machine has a GPU, and install any dependencies with conda (e.g. from conda-forge) and/or pip.
    This way, once you have trained your model, you can use a Python-based web framework such as Django or Flask to deploy the model to a web server and add a client (user interface) for the project. Remember to keep the project under version control, e.g. on GitHub, where GitHub Actions can handle CI/CD and automate your deployment, even if you later decide to host the project in the cloud. I know you didn't want to "...encompass the whole ocean here."

    Good luck with your project!

    Frank



    ------------------------------
    Frank Yeboah
    Chief Data Scientist
    YAnalytics
    Meridian ID
    ------------------------------