AI Report
AI Report
ADVANCEMENTSINAIRESEARCH:
TEACHINGMACHINESTOSEEANDUNDERSTAND
HarshitJain(13CSU049)Dr.SupriyraPanda
HarshitJuneja(13CSU050)(Professor)
HimaniMalhotra(13csu051)
ACKNOWLEDGEMENT
Weshouldallbethankfulforthosepeoplewhorekindletheinnerspirit.
Foremost, We would like to thank Almighty for making the endeavour towardssuccess.Itgives
us immense pleasure in acknowledging the efforts of the faculty of The NorthCap University.
Theyprovidedtheverybestopportunityatalllevelsthathelpedustocompleteourprojectandto
polish our technical skills. We also express my gratitude to our respected teacher Dr. Supriya
Panda for their intellectual support throughout the course of this project. She is extremely
tremendousandenergeticandherzealforworkhasgivenmeanewdirectionahead.
TableofContents
Abstract4
Objectdetectionandmemorynetworks5
Predictionandplanning8
Conclusion12
References13
Abstract
Fromtexttophotos,throughvideoandsoonVR,theamountofinformationbeinggeneratedinthe
worldisonlyincreasing.Infact,theamountofdataweneedtoconsiderhasbeengrowingbyabout50
percentyearoveryearandhumanwakinghoursaren'tkeepingupwiththatgrowthrate.Thebest
wayIcanthinkoftokeeppacewiththisgrowthistobuildintelligentsystemsthatwillhelpussort
throughthedelugeofcontent.
Totacklethis,AIgroupshavebeenconductingambitiousresearchinareaslikeimagerecognitionand
naturallanguageunderstanding.
Objectdetectionandmemorynetworks
Thefirstoftheseisinasubsetofcomputervisionknownasobjectdetection.Objectdetectionis
hard.Takethisphoto,forexample:
Howmanyzebrasdoyouseeinthephoto?Hardtotell,right?Imaginehowhardthisisfora
machine,whichdoesn'tevenseethestripesitseesonlypixels.Researchershavebeen
workingtotrainsystemstorecognizepatternsinthepixelssotheycanbeasgoodasorbetter
thanhumansatdistinguishingobjectsinaphotofromoneanotherknowninthefieldas
segmentationandthenidentifyingeachobject.Ourlatestsystem,whichwe'llbepresenting
atNIPSnextmonth,cansegmentimages30percentfasterthanmostothersystems,using10x
lesstrainingdata.
Nextmilestoneisinnaturallanguageunderstanding,withnewdevelopmentsinanew
technologycalledMemoryNetworks(akaMemNets).MemNetsaddatypeofshortterm
memorytotheconvolutionalneuralnetworksthatpowerourdeeplearningsystems,allowing
thosesystemstounderstandlanguagemorelikeahumanwould.ThisdemoofMemNetsat
work,readingandthenansweringquestionsaboutashortsynopsisofTheLordoftheRings.
Nowwe'vescaledthissystemfrombeingabletoreadandanswerquestionsontensoflinesof
texttobeingabletoperformthesametaskondatasetsexceeding100Kquestions,anorderof
magnitudelargerthanpreviousbenchmarks.
Theseadvancementsincomputervisionandnaturallanguageunderstandingareexcitingontheir
own,butwhereitgetsreallyexcitingiswhenyoubegintocombinethem.Takealook:
InthisdemoofthesystemwecallVQA,orvisualQ&A,youcanseethepromiseofwhat
happenswhenyoucombineMemNetswithimagerecognition:We'reabletogivepeoplethe
abilitytoaskquestionsaboutwhat'sinaphoto.Thinkofwhatthismightmeantothehundredsof
millionsofpeoplearoundtheworldwhoarevisuallyimpairedinsomeway.Insteadofbeingleft
outoftheexperiencewhenfriendssharephotos,they'llbeabletoparticipate.Thisisstillvery
earlyinitsdevelopment,butthepromiseofthistechnologyisclear.
Predictionandplanning
Therearealsosomebigger,longertermchallengeswereworkingoninAI.Someofthese
includeunsupervisedandpredictivelearning,wherethesystemscanlearnthroughobservation
(insteadofthroughdirectinstruction,whichisknownassupervisedlearning)andthenbeginto
makepredictionsbasedonthoseobservations.ThisissomethingyouandIdonaturallyfor
example,noneofushadtogotoauniversitytolearnthatapenwillfalltothegroundifyou
pushitoffyourdeskandit'showhumansdomostoftheirlearning.Butcomputersstillcant
dothisouradvancesincomputervisionandnaturallanguageunderstandingarestillbeing
drivenbysupervisedlearning.
TheFAIRteamrecentlystartedtoexplorethesemodels,andyoucanseesomeofearlyprogress
demonstratedbelow.Theteamhasdevelopedasystemthatcanwatchaseriesofvisualtests
inthiscase,setsofprecariouslystackedblocksthatmayormaynotfallandpredictthe
outcome.Afterjustafewmonths'work,thesystemcannowpredictcorrectly90percentofthe
time,whichisbetterthanmosthumans.
Anotherareaoflongertermresearchisteachingoursystemstoplan.Oneofthethingswe've
builttohelpdothisisanAIplayerfortheboardgameGo.Usinggamestotrainmachinesisa
prettycommonapproachinAIresearch.Inthelastcoupleofdecades,AIsystemshavebecome
strongerthanhumansatgameslikecheckers,chess,andevenJeopardy.Butdespiteclosetofive
decadesofworkonAIGoplayers,thebesthumansarestillbetterthanthebestAIplayers.This
isdueinparttothenumberofdifferentvariationsinGo.Afterthefirsttwomovesinachess
game,forexample,thereare400possiblenextmoves.InGo,therearecloseto130,000.
WevebeenworkingonourGoplayerforonlyafewmonths,butit'salreadyonparwiththe
otherAIpoweredsystemsthathavebeenpublished,andit'salreadyasgoodasaverystrong
humanplayer.We'veachievedthisbycombiningthetraditionalsearchbasedapproach
modelingouteachpossiblemoveasthegameprogresseswithapatternmatchingsystembuilt
byourcomputervisionteam.ThebesthumanGoplayersoftentakeadvantageoftheirabilityto
recognizepatternsontheboardasthegameevolves,andwiththisapproachourAIplayerisable
tomimicthatabilitywithverystrongearlyresults.
Sowhathappenswhenyoustarttoputallthistogether?Facebookiscurrentlyrunningasmall
testofanewAIassistantcalledM.Unlikeothermachinedrivenservices,Mtakesthingsfurther:
Itcanactuallycompletetasksonyourbehalf.Itcanpurchaseitemsarrangeforgiftstobe
deliveredtoyourlovedonesandbookrestaurantreservations,travelarrangements,
appointments,andmore.Thisisahugetechnologychallengeit'ssohardthat,startingout,M
isahumantrainedsystem:HumanoperatorsevaluatetheAI'ssuggestedresponses,andthen
theyproduceresponseswhiletheAIobservesandlearnsfromthem.
We'dultimatelyliketoscalethisservicetobillionsofpeoplearoundtheworld,butforthattobe
possible,theAIwillneedtobeabletohandlethemajorityofrequestsitself,withnohuman
assistance.Andtodothat,weneedtobuildallthedifferentcapabilitiesdescribedabove
language,vision,prediction,andplanningintoM,soitcanunderstandthecontextbehind
eachrequestandplanaheadateverystepoftheway.Thisisareallybigchallenge,andwere
justgettingstarted.Buttheearlyresultsarepromising.WhensomeoneasksMforhelpordering
flowers,MnowknowsthatthefirsttwoquestionstoaskareWhatsyourbudget?andWhere
areyousendingthem?
10
Onelastpointhere:Someofyoumaylookatthisandsay,Sowhat?Ahumancoulddoallof
thosethings.Andyou'reright,ofcoursebutmostofusdon'thavededicatedpersonal
assistants.Andthat'sthesuperpowerofferedbyaservicelikeM:Wecouldgiveeveryoneof
thebillionsofpeopleintheworldtheirowndigitalassistantssotheycanfocuslesson
daytodaytasksandmoreonthethingsthatreallymattertothem.
11
Conclusion
Researchershavebeenworkingtotrainsystemstorecognizepatternsinthepixelsso
theycanbeasgoodasorbetterthanhumansatdistinguishingobjectsinaphotofrom
oneanother-knowninthe eldas"Segmentation"-andthenidentifyingeachobject.
PredictionandplanningTherearealsosomebigger,longer-termchallengeswe're
workingoninAI.Someoftheseincludeunsupervisedandpredictivelearning,wherethe
systemscanlearnthroughobservationandthenbegintomakepredictionsbasedon
thoseobservations.
ThisissomethingyouandIdonaturally-forexample,noneofushadtogotoa
universitytolearnthatapenwillfalltothegroundifyoupushito yourdesk-andit's
howhumansdomostoftheirlearning.
Inthelastcoupleofdecades,AIsystemshavebecomestrongerthanhumansatgames
likecheckers,chess,andevenJeopardy.
ResearchershavebeenworkingonourGoplayerforonlyafewmonths,butit'salready
onparwiththeotherAI-poweredsystemsthathavebeenpublished,andit'salreadyas
goodasaverystronghumanplayer.
Thisisahugetechnologychallenge-it'ssohardthat,startingout,Misahuman-trained
system:HumanoperatorsevaluatetheAI'ssuggestedresponses,andthentheyproduce
responseswhiletheAIobservesandlearnsfromthem.
WhensomeoneasksMforhelpordering owers,Mnowknowsthatthe rsttwo
questionstoaskare"What'syourbudget?"and"Whereareyousendingthem?"Onelast
pointhere:Someofyoumaylookatthisandsay,"Sowhat?Ahumancoulddoallofthose
things."Andyou'reright,ofcourse-butmostofusdon'thavededicatedpersonal
assistants.
12
REFERENCES
1. http://www.wired.com/2015/10/facebookartificialintelligencedescribesphotocaptions
forblindpeople/
2. https://research.facebook.com/ai
3. http://www.pcmag.com/news/343445/facebooktouseaitodescribephotostoblinduse
rs
13