snxtyle/Data-Extraction-and-Sentiment-Analysis-using-NLP

Data-Extraction-and-Sentiment-Analysis-using-NLP

This repository extracts article text from the URLs listed in the input.xlsx file and runs text analysis to compute the variables listed below. For each article, the program extracts the text and saves it to a text file named after its URL_ID. Only the article title and the article text should be extracted, not the website header, footer, or any other page content. The analysis computes the following variables:

1. POSITIVE SCORE

2. NEGATIVE SCORE

3. POLARITY SCORE

4. SUBJECTIVITY SCORE

5. AVG SENTENCE LENGTH

6. PERCENTAGE OF COMPLEX WORDS

7. FOG INDEX

8. AVG NUMBER OF WORDS PER SENTENCE

9. COMPLEX WORD COUNT

10. WORD COUNT

11. SYLLABLE PER WORD

12. PERSONAL PRONOUNS

13. AVG WORD LENGTH
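
The variables above are not defined on this page, so the following is only a minimal sketch of how they might be computed, not the repository's actual implementation. It assumes a tiny hypothetical sentiment lexicon (`POSITIVE_WORDS`, `NEGATIVE_WORDS`), a vowel-group syllable heuristic, a complex word defined as one with more than two syllables, and the standard Gunning Fog formula with the complex-word share expressed as a percentage. Stop-word removal is omitted, so WORD COUNT here simply counts every alphabetic token.

```python
import re

# Hypothetical toy lexicon; a real run would load full positive/negative word lists.
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "terrible"}
PRONOUN_RE = re.compile(r"\b(I|we|my|ours|us)\b", re.IGNORECASE)
EPS = 1e-6  # avoids division by zero in the score ratios


def count_syllables(word):
    """Rough heuristic: count vowel groups, discounting a trailing 'es'/'ed'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith(("es", "ed")) and count > 1:
        count -= 1
    return max(count, 1)


def analyze(text):
    """Return the 13 variables for one article, under the assumptions above."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    n_sentences = max(len(sentences), 1)

    lowered = [w.lower() for w in words]
    positive = sum(w in POSITIVE_WORDS for w in lowered)
    negative = sum(w in NEGATIVE_WORDS for w in lowered)

    syllables = [count_syllables(w) for w in words]
    complex_count = sum(s > 2 for s in syllables)

    avg_sentence_length = n_words / n_sentences
    pct_complex = 100 * complex_count / n_words if n_words else 0.0

    # Skip the all-caps token "US", which is more likely the country than a pronoun.
    pronouns = sum(1 for m in PRONOUN_RE.finditer(text) if m.group() != "US")

    return {
        "POSITIVE SCORE": positive,
        "NEGATIVE SCORE": negative,
        "POLARITY SCORE": (positive - negative) / (positive + negative + EPS),
        "SUBJECTIVITY SCORE": (positive + negative) / (n_words + EPS),
        "AVG SENTENCE LENGTH": avg_sentence_length,
        "PERCENTAGE OF COMPLEX WORDS": pct_complex,
        "FOG INDEX": 0.4 * (avg_sentence_length + pct_complex),
        "AVG NUMBER OF WORDS PER SENTENCE": avg_sentence_length,
        "COMPLEX WORD COUNT": complex_count,
        "WORD COUNT": n_words,
        "SYLLABLE PER WORD": sum(syllables) / n_words if n_words else 0.0,
        "PERSONAL PRONOUNS": pronouns,
        "AVG WORD LENGTH": sum(len(w) for w in words) / n_words if n_words else 0.0,
    }
```

Calling `analyze(article_text)` returns a dictionary keyed by the variable names, which could be written out as one row per URL_ID. In this sketch AVG SENTENCE LENGTH and AVG NUMBER OF WORDS PER SENTENCE are the same ratio; a different definition would need its own calculation.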

About

This repository shows how to extract the title and content text from any article, and how to use the extracted data to determine the sentiment of the text.
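
As an illustration of the extraction step, here is a minimal sketch that uses only the Python standard library. It assumes the page wraps its content in an `<article>` element and that the first `<h1>` is the article title; real sites differ, so the selectors would need adjusting per site. The names `ArticleParser`, `extract_article`, and `save_article` are hypothetical and are not taken from the repository. Fetching the pages and reading input.xlsx are not shown.

```python
from html.parser import HTMLParser
from pathlib import Path


class ArticleParser(HTMLParser):
    """Collects the first <h1> as the title and the <p> text inside <article>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._article_depth = 0
        self._in_h1 = False
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self._article_depth += 1
        elif tag == "h1" and not self.title:
            self._in_h1 = True
            self._buf = []
        elif tag == "p" and self._article_depth:
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "article" and self._article_depth:
            self._article_depth -= 1
        elif tag == "h1" and self._in_h1:
            self.title = " ".join("".join(self._buf).split())
            self._in_h1 = False
        elif tag == "p" and self._in_p:
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        # Text is kept only while inside the title or an article paragraph,
        # so header and footer content never reaches the output.
        if self._in_h1 or self._in_p:
            self._buf.append(data)


def extract_article(html):
    """Return (title, body) for one HTML page."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return parser.title, "\n".join(parser.paragraphs)


def save_article(url_id, html, out_dir="articles"):
    """Write the title and body to <out_dir>/<url_id>.txt and return the path."""
    title, body = extract_article(html)
    out_path = Path(out_dir) / f"{url_id}.txt"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(f"{title}\n{body}\n", encoding="utf-8")
    return out_path
```

Given the HTML of one page and its URL_ID from input.xlsx, `save_article(url_id, html)` would produce the per-article text file described above. The `articles` output directory is an arbitrary choice for this sketch.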
