Is there a way to create a new row with a duplicate sentence but expanded acronym under it?

So I have a dataframe with a column whose rows contain sentences with acronyms. I have a list of what those acronyms stand for in two columns of a separate dataframe.

What I would like to do is, for every cell in that first dataframe's column in which an acronym is used, create a new row underneath it with the same exact sentence except the acronym is now expanded.

I have as input a dataframe with one column and another dataframe with each acronym and its expansion:

Column 1
I work at the CIA
I work at the NSA
I have worked at both the NSA and CIA

Column A   Column B
CIA        Central Intelligence Agency
NSA        National Security Agency

And what I want to get (desired output):

Column 1
I work at the CIA
I work at the Central Intelligence Agency
I work at the NSA
I work at the National Security Agency
I have worked at both the NSA and CIA
I have worked at both the National Security Agency and the Central Intelligence Agency


Solution 1:[1]

I have no idea why you want to add rows to your DataFrame, but here is an approach that expands the acronyms using the second DataFrame.

Given two DataFrames defined as follows:

import pandas as pd

df1 = pd.DataFrame(data=['I work at the CIA', 'I work at the NSA',
                         'I have worked at both the NSA and CIA'], columns=['Raw_in'])

which yields:

    Raw_in
0   I work at the CIA
1   I work at the NSA
2   I have worked at both the NSA and CIA 

df2 = pd.DataFrame(data=[['CIA', 'Central Intelligence Agency'],
                         ['NSA', 'National Security Agency']], 
                   columns=['Abrev', 'Title'])

which yields:

    Abrev   Title
0   CIA     Central Intelligence Agency
1   NSA     National Security Agency

Define a translation function as follows:

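def createNew(schstr, dfx):
    # Split the sentence into words (on single spaces only)
    schList = schstr.split(' ')
    # All known acronyms from the lookup DataFrame
    keys = dfx['Abrev'].to_list()
    for i, w in enumerate(schList):
        if w in keys:
            # Swap the acronym for its first matching expansion
            schList[i] = dfx[dfx['Abrev'] == w]['Title'].values[0]
    return " ".join(schList)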
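One limitation of splitting on spaces is that an acronym followed by punctuation (for example "CIA." or "NSA,") will not match a key. As a rough sketch of one way around that, the variant below uses regular-expression word boundaries and a plain dict. The name expand_acronyms and the dict argument are assumptions made for illustration; they are not part of the original answer.

```python
import re

def expand_acronyms(sentence, mapping):
    """Replace whole-word acronyms in `sentence` using the dict `mapping`."""
    if not mapping:
        return sentence
    # Try longer keys first so a long acronym is preferred over a shorter one.
    keys = sorted(mapping, key=len, reverse=True)
    # \b anchors each match to word boundaries, so "CIA." matches but "CIAO" does not.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')
    return pattern.sub(lambda m: mapping[m.group(0)], sentence)

# Hypothetical sample data mirroring the acronym table above.
mapping = {'CIA': 'Central Intelligence Agency',
           'NSA': 'National Security Agency'}

print(expand_acronyms('I left the NSA, then joined the CIA.', mapping))
# I left the National Security Agency, then joined the Central Intelligence Agency.
```

With the DataFrames from this answer, the dict could be built as dict(zip(df2['Abrev'], df2['Title'])) and the function used in place of createNew.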

And employ the translation as follows:

df1['Results'] = [createNew(x, df2) for x in df1['Raw_in'].to_list()]

This results in adding a column to df1 as follows:

    Raw_in                                  Results
0   I work at the CIA                       I work at the Central Intelligence Agency
1   I work at the NSA                       I work at the National Security Agency
2   I have worked at both the NSA and CIA   I have worked at both the National Security Agency and Central Intelligence Agency
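If the expanded sentences really are needed as extra rows directly under the originals, as the question asks, one possible sketch is to build the interleaved list in plain Python and create a new DataFrame from it. The names lookup, expand, rows and result are illustrative assumptions, and expand repeats the same split-on-spaces replacement as createNew. Note that plain substitution does not insert the extra "the" that appears in the last line of the question's desired output.

```python
import pandas as pd

df1 = pd.DataFrame({'Raw_in': ['I work at the CIA',
                               'I work at the NSA',
                               'I have worked at both the NSA and CIA']})
df2 = pd.DataFrame({'Abrev': ['CIA', 'NSA'],
                    'Title': ['Central Intelligence Agency',
                              'National Security Agency']})

# Build the acronym -> expansion lookup once instead of filtering df2 per word.
lookup = dict(zip(df2['Abrev'], df2['Title']))

def expand(sentence):
    # Word-by-word replacement; words that are not acronyms are kept as-is.
    return ' '.join(lookup.get(word, word) for word in sentence.split(' '))

rows = []
for sentence in df1['Raw_in']:
    rows.append(sentence)
    expanded = expand(sentence)
    if expanded != sentence:
        # Only add a second row when at least one acronym was replaced.
        rows.append(expanded)

result = pd.DataFrame({'Raw_in': rows})
print(result)
```

Building the list first and constructing the DataFrame once avoids inserting rows into an existing DataFrame one at a time, which tends to be slow and awkward with pandas.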

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 itprorh66