N Grams Python Program
from nltk.corpus import brown
from nltk.tokenize import word_tokenize

corpus = brown.words()
lower_case_corpus = [word.lower() for word in corpus]
vocab = set(lower_case_corpus)
print('CORPUS EXAMPLE: ' + str(lower_case_corpus[:30]) + '\n\n')
print('VOCAB EXAMPLE: ' + str(list(vocab)[:10]))
CORPUS EXAMPLE: ['the', 'fulton', 'county', 'grand', 'jury', 'said', 'friday', 'an', 'investigation', 'of', "atlanta's", 'recent', 'primary', 'election', 'produced', '``', 'no', 'evidence', "''", 'that', 'any', 'irregularities', 'took', 'place', '.', 'the', 'jury', 'further', 'said', 'in']

VOCAB EXAMPLE: ['drudgery', 'one-arm', 'growling', 'cutest', 'rain', 'hops', "network's", 'expressionists', 'polarization', 'gaussian']
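If the Brown corpus and the tokenizer models used by word_tokenize are not already present locally, NLTK needs a one-time download; a minimal sketch (resource names assume a standard NLTK install):

import nltk
nltk.download('brown')   # the Brown corpus used as training text
nltk.download('punkt')   # tokenizer models for word_tokenize (newer NLTK releases may also need 'punkt_tab')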
bigram_counts = {}
trigram_counts = {}

# sliding a window over the corpus to collect bigram and trigram counts
for i in range(len(lower_case_corpus) - 2):
    bigram = (lower_case_corpus[i], lower_case_corpus[i + 1])
    trigram = (lower_case_corpus[i], lower_case_corpus[i + 1], lower_case_corpus[i + 2])

    # keeping track of bigram counts
    if bigram in bigram_counts:
        bigram_counts[bigram] += 1
    else:
        bigram_counts[bigram] = 1

    # keeping track of trigram counts
    if trigram in trigram_counts:
        trigram_counts[trigram] += 1
    else:
        trigram_counts[trigram] = 1
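As an aside, the same counts can be built more compactly with collections.Counter and nltk.util.ngrams; a minimal sketch, equivalent up to the last one or two tokens of the corpus (the loop above stops two tokens before the end):

from collections import Counter
from nltk.util import ngrams

bigram_counts_alt = Counter(ngrams(lower_case_corpus, 2))
trigram_counts_alt = Counter(ngrams(lower_case_corpus, 3))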
# Function takes a sentence as input and suggests possible words that come after it
def suggest_next_word(input_, bigram_counts, trigram_counts, vocab):
    # tokenize the input and keep its last two words as the context bigram
    tokenized_input = word_tokenize(input_.lower())
    last_bigram = tokenized_input[-2:]

    # estimate P(word | context bigram) for every word in the vocabulary
    vocab_probabilities = {}
    for vocab_word in vocab:
        test_trigram = (last_bigram[0], last_bigram[1], vocab_word)
        test_bigram = (last_bigram[0], last_bigram[1])

        test_trigram_count = trigram_counts.get(test_trigram, 0)
        test_bigram_count = bigram_counts.get(test_bigram, 0)

        # count(w1, w2, word) / count(w1, w2); guard against an unseen context bigram
        probability = test_trigram_count / test_bigram_count if test_bigram_count > 0 else 0.0
        vocab_probabilities[vocab_word] = probability

    # sorting the vocab probabilities in descending order to get the top probable words
    return sorted(vocab_probabilities.items(), key=lambda item: item[1], reverse=True)[:3]
[('james', 0.17647058823529413), ('of', 0.1568627450980392), ('arthur', 0.11764705882352941)]
[('france', 0.3333333), ('hearts', 0.1666666), ('morocco', 0.0833333)]
[('the', 0.2), ('germany', 0.1333), ('some', 0.066667)]
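Each of the three lists above is the top-3 suggestion list returned for a different input phrase (the prompts themselves are not reproduced here). A call looks like the following; the sentence is purely hypothetical:

print(suggest_next_word('the jury said that the', bigram_counts, trigram_counts, vocab))
# prints a list of (word, probability) tuples for the three most probable next words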