Using R tm to find trend between terms/entities
I have a corpus of text documents on the subject of pollutant fate and transport. I built the TermDocumentMatrix and ran term association. However, I would like to find out the "trend association" between terms. For example, I would like to find out whether more ambient light would increase hydrolysis of a chemicalX. I already have 'light', 'hydrolysis', 'increase', and 'chemicalX' in the TermDocumentMatrix. What is a good way to answer the question I posed above? Please note that I have already run findAssocs among these terms, and they are to a certain degree positively linked together (all above 0.5).
Please advise. Thanks
Below is the rough tm process I used. Please note that I have many other docs; I just made an excerpt of a small text as an example:
> require(tm)
> my.docs <- c("These experiments showed that the ordinary and the polarized
+ lights had a stimulating effect on the hydrolytic process, and
+ both of about the same magnitude. When hydrolysis goes on
+ (Curves I and II in Figs. 3 and 4) in the presence of light, a larger
+ amount of the starch substrate is hydrolyzed. The differences
+ between the two curves (ordinary light and polarized light) are
+ quite insignificant; they are of the magnitude of twice the probable
+ error of the mean and so far as it is consistent it can be attributed
+ to the slight differences existing in the spectral composition of the
+ lights.
+
+ The situation regarding the effect of radiation on the starch-
+ diastase system is, in brief:
+ 1. Ordinary light and polarized light, of the same intensity and
+ as closely as possible similar in spectral composition, have the
+ same effect.
+ 2. Light falling on the starch-diastase system as described, increases
+ the rate of hydrolysis over that of the same reaction in the
+ dark.
+ ")
> funcs <- list(tolower, removePunctuation, stripWhitespace, removeNumbers)
> lightC <- Corpus(VectorSource(my.docs))
> lightCC <- tm_map(lightC, FUN=tm_reduce, tmFuns=funcs)
> my.dictionary.terms <- tolower(c("light","hydrolysis","increases","decreases","reduce","starch"))
> my.dictionary <- Dictionary(my.dictionary.terms)
> tdmLight <- TermDocumentMatrix(lightCC, control=list(weighting=weightTfIdf, stopwords=stopwords("english"), dictionary=my.dictionary))
> findAssocs(tdmLight, "light", 0.5)
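For context, findAssocs reports correlations between term rows across documents, so a score above 0.5 says the terms tend to appear in the same documents but not whether light increases or decreases hydrolysis. One possible way to get closer to direction is to count direction words in sentences that mention both subject terms. The following is only a rough sketch in base R under stated assumptions: the sample text, the naive sentence splitter, the term lists, and the helper `has_term` are all hypothetical and not part of tm.

```r
# Hypothetical example text; in practice this would come from the corpus documents.
txt <- paste(
  "Light falling on the starch-diastase system increases the rate of hydrolysis.",
  "Ordinary light and polarized light have the same effect.",
  "In the dark the rate of hydrolysis decreases."
)

# Naive sentence split on ., ! or ? followed by whitespace (an assumption:
# abbreviations such as "Figs. 3" would be split incorrectly).
sentences <- tolower(unlist(strsplit(txt, "[.!?]\\s+")))

# Hypothetical helper: TRUE for each sentence containing `term` as a whole word.
# Note that "lights" does not match "light"; stemming would be needed for that.
has_term <- function(term, x) grepl(paste0("\\b", term, "\\b"), x, perl = TRUE)

subject_terms   <- c("light", "hydrolysis")
direction_terms <- c("increases", "decreases")

# Sentences that mention every subject term.
both <- Reduce(`&`, lapply(subject_terms, has_term, x = sentences))

# For each direction word, count the sentences that also contain both subject terms.
direction_counts <- sapply(
  direction_terms,
  function(d) sum(both & has_term(d, sentences))
)
print(direction_counts)
```

Counts like these only indicate which direction word co-occurs with the subject terms at sentence level; they do not handle negation ("does not increase") or which term is the cause, so the matching sentences would still need to be read.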
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
