Transcript for:
Top 40 Python Libraries Explained - Kite

Hey there, it's Kalyn from Kite, the AI-powered coding assistant. With over 250 libraries in Python, it can be a bit confusing to know which one is best for your project, so in this video I'll be going through the top 40 libraries that I think you should know about.

Natural language processing is a field that combines linguistics and computer science. It allows computers to process and analyze language. The Natural Language Toolkit, or NLTK, is one of the most popular natural language processing libraries. It allows you to perform a variety of operations on the English language, like tokenizing, tagging, and stemming. You can tokenize words or sentences, which simply separates words in a sentence or sentences in a paragraph. For example, tokenizing this sentence about New York muffins outputs a list of strings, with each word as a separate string. Now you can tag each word of a tokenized sentence with a part-of-speech label. This will output a tuple for each word, with the word followed by its part of speech. NNP, which stands for proper noun, singular, is the tag for John. You can also stem words. The stem of the word maximum is maximum, and the stem of the word presumably is presumed. It's important to note that there are several different methods of stemming, and each will produce different outputs based on its unique algorithm. By combining the basic functionalities of NLTK, you can develop more complex programs like stocksight.

Gensim is another Python natural language processing library. Its target audience is the natural language processing and information retrieval communities. It comes with a simple interface, memory-independent algorithms, and efficient multi-core implementations of popular algorithms like LSA, LDA, and RP. Gensim is simple and easy to pick up, and also comes with extensive documentation and Jupyter Notebook tutorials.

Another important aspect of natural language processing is searching and replacing words. FlashText is the perfect library for this. It allows you to extract, replace,
and remove keywords in a given text. One huge benefit of FlashText is its speed. By using a trie data structure, FlashText is able to perform super fast on large pieces of data. You can see it outperforms regex for text larger than 500 words, and it's significantly faster for even larger text. However, one thing to note is that FlashText is unable to search special characters. FlashText is the go-to library for large data.

Computer vision is a field where computers identify, classify, and react to objects visually. OpenCV, also known as Open Source Computer Vision, is the largest computer vision library. Some of its useful functions include reading and writing images at the same time, detecting edges, and filtering images. By combining the various functions of OpenCV, you can create programs like this face detector. Not only can it detect where human faces are located, it can differentiate and identify the name of the person, and even apply makeup.

SimpleCV is a beginner-friendly open source framework for building computer vision applications. It's basically OpenCV, but for beginners. It allows you to access several high-powered computer vision libraries, including OpenCV, but without having to first learn about computer vision in detail. A pedestrian walk sign program is a project you can try out to get started in computer vision. The program will tell you to go unless it detects a light source. When a light source is detected, the program will display a stop sign. One downside is that it only works with Python 2.7, but it's still worth trying out for beginners.

The graphical user interface is a system of interactive visual components for computer software, and it's often referred to as GUI. The Tkinter package is the standard Python interface to the Tk GUI toolkit. Python, when combined with Tkinter, provides a fast and easy way to create GUI applications, and there are a handful of widgets in Tkinter, like frames, labels, and buttons. Each of these widgets has several attributes, like size,
padding, and borders, and you can create these widgets and customize their attributes to create GUI applications in Python. I recommend this library for simple and fast projects.

wxPython is a GUI toolkit for the Python language that's commonly used as an alternative to Tkinter. It's a great choice for cross-platform Python, since it supports Windows, Mac, and Linux. On top of that, wxPython is easy to use and offers a sophisticated design layout for developers. OutWiker is one program developed using wxPython that stores notes in a tree.

PyQt is another cross-platform GUI worth mentioning, and it has the most flexibility out of all the GUI libraries, making it the best for complex projects. In addition to its rich collection of widgets, PyQt includes a fully functional web browser and a help system, and supports Unicode, regular expressions, SQL databases, and XML. You can create things like calculators, weather apps, and even cryptocurrency market trackers using PyQt.

Kite is a free plugin for your code editor that uses machine learning to save you keystrokes while you're programming. So if you're using Atom, VS Code, Spyder, PyCharm, Sublime, or Vim, Kite will seamlessly integrate into your coding workflow. Kite can complete entire lines of code, and it has a feature called Intelligent Snippets that will help you fill in arguments and method calls with variables defined earlier in your script. The window on the right side of my screen here is also a Kite feature, called the Kite Copilot. It automatically shows you relevant Python documentation while you type, based on your cursor location. This saves you time from having to Google search for docs. The best part of Kite is that it's free, and you can download it from the link in the description below.

You've probably had to create some sort of game at least once during your programming journey. Whether it be classics like Pong and Tetris or a game from your own imagination, these libraries will allow you to create the game of your choice. Pygame is a super
easy-to-learn wrapper module for writing video games. It contains computer graphics and sound libraries, allowing you to create dynamic games fast. Programs written with Pygame are compatible with all SDL-supported operating systems and can also run on Android and tablets. Features like pixel and camera manipulation, MIDI, and collision detection are also supported. You can use Pygame to create games like Space Shooter and T-Rex Rush, and if you ever need inspiration, you can check out the Pygame website for thousands of games others have created.

If you want to create a 3D game, pyglet is your go-to library. Unlike Pygame, pyglet is capable of creating three-dimensional GUIs. On top of that, pyglet has no external dependencies or installation requirements. It allows you to use as many windows as you need, and loads images, sound, music, and video in almost any format.

PyEngine3D is an open source Python 3D engine that can be used to create stunning 3D graphics, like these here.

These are the top web-related libraries that perform functions like HTTP requests, web scraping, parsing, and creating web apps. Requests is the most popular Python HTTP library, and it's used to send HTTP requests. It has tons of features and is especially great for beginners. You can add parameters, headers, multi-part files, and form data to HTTP requests. This program, called Lassie, uses the Requests library to retrieve basic content from websites. For example, if you input the URL of a YouTube video, it will retrieve information like the title, description, and keywords.

Scrapy, hence the name, is a web scraping library and is used for extracting the data you need from websites. It's mainly used for creating web crawling programs. Initially it was designed for just scraping, but now it's used for data mining and automated testing as well. Tons of companies use Scrapy to conduct business. For example, CareerBuilder scrapes job postings from many sites, Parse.ly scrapes articles from hundreds of news sites, and Lyst uses Scrapy to crawl
and scrape fashion websites.

Beautiful Soup is another library commonly used for web scraping. However, it's also great for parsing. It can parse different broken HTML and XML elements. It offers an easy way for web scraping by extracting data directly from HTML. It's very easy to use, making it perfect for beginners. An interesting project that relies on Beautiful Soup is this sports prediction project. It scrapes all sorts of sports stats to make predictions on upcoming games.

Zappa makes it super easy to build and deploy serverless, event-driven Python applications on AWS Lambda and API Gateway. It's basically serverless web hosting for Python apps. It comes with infinite scaling, zero downtime, and zero maintenance. Its minimal cost is one of the best features too: since you only pay based on the number of requests you serve, it saves you a lot of money.

Django is a very popular Python-based free and open source web framework. Its main focus is to ease the creation of complex, database-driven websites. Django takes care of features like user authentication, content administration, site maps, and RSS feeds. Django is fast, secure, scalable, and versatile, making it an attractive framework used by many businesses today. Some big companies that use Django include Instagram, Spotify, YouTube, Robinhood, and Pinterest.

Flask is another very popular web framework, often compared to Django. It's newer and is more popular than Django based on the number of projects. More specifically, it's a lightweight Web Server Gateway Interface, or WSGI, framework. It's a bit more flexible than Django and comes with URL routing, request and error handling, templating, cookies, support for unit testing, a debugger, and a development server. Large companies like Airbnb, Netflix, Lyft, Patreon, and Uber use Flask. Both Django and Flask are great frameworks, but it's ultimately up to you to decide which one fits better for your project. I'd recommend Django for heavy, complex websites and Flask for simple, small websites.

Here are some must-know libraries that
have to do with math. NumPy provides numerous advanced math functionalities and is best suited for arrays and matrices. It's fast and efficient, making it completely capable of handling large amounts of data. NumPy also supports logical and shape manipulations, discrete Fourier transforms, and general linear algebra functionalities.

SciPy goes hand in hand with NumPy and is commonly used for machine learning and image manipulation. It provides many user-friendly and efficient numerical routines, such as routines for numerical integration, interpolation, optimization, linear algebra, and statistics. If you ever need help, the supportive community of SciPy is always there to answer your regular questions and solve any issues.

SymPy is another essential library for mathematics. It can perform basic operations like basic arithmetic, simplifications, and trigonometric functions. However, it's capable of much more than that, like Taylor series, matrix inversions, and cryptography. Many programs, like Spyder and ChemPy, are based on SymPy. Spyder is a scientific Python development environment, or IDE, and you can think of it as a Python equivalent to RStudio. ChemPy contains functions like an equilibrium solver that's useful for chemistry.

Data science is a hot field that aims to extract knowledge and insights from data. Pandas is a must for anything data science. It allows you to easily organize, explore, represent, and manipulate data. One huge plus is its clean and well-organized code, making it beginner friendly. Some features beyond the basics include the capability to read and write data in different web services, data structures, and databases, and also easy organization and data labeling using smart alignment and indexing.

Orange is an open source machine learning and data visualization software that uses pandas. It comes with countless useful features for both beginners and experts.

SQLAlchemy is the Python SQL toolkit and object-relational mapper that gives application developers the full power and
flexibility of SQL databases. It's a bit more specific in that it's for SQL, but it is very useful. It makes communication between Python and databases easier and faster. It features a core that often makes its ORM optional in a mature, high-performing architecture.

If you want to visualize your data as a graph, Matplotlib is the perfect library. You can create almost any type of graph or plot desired, such as histograms, stream plots, pie charts, scatter plots, and polar plots. Matplotlib has an active issue tracker page on GitHub where you can keep up with the most recent bugs, new patches, and feature requests.

Plotly is another library for making graphs, but it's a little more advanced than Matplotlib. It's best for creating elaborate plots more efficiently. It has great support for complex and multi-axis plots, integrated zoom and filter-out tools, and is able to create three-dimensional plots. I'd say Plotly is best for those who are already familiar with Matplotlib and are looking for ways to build more complex visuals more efficiently.

Scikit-learn is an open source, commercially usable Python library for working with complex data. It has six main components: classification, identifying which category an object belongs to; regression, predicting a continuous-valued attribute associated with an object; clustering, automatic grouping of similar objects into sets; dimensionality reduction, reducing the number of random variables to consider; model selection, comparing, validating, and choosing parameters and models; and finally preprocessing, feature extraction and normalization.

Imbalanced data sets describe situations where class distribution is not uniform among the classes, and can lead to big problems if not accounted for properly. For example, the classification model you are working on has an accuracy of 80 percent. However, you discover that eight percent of the data belongs to one class. Imbalanced-learn is a Python package that offers a number of resampling techniques commonly used for correcting imbalanced data sets
like this. It's compatible with scikit-learn and is part of the scikit-learn-contrib projects.

Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions. It has a tight integration with NumPy, transparent use of a GPU, and extensive unit testing and self-verification.

Bokeh is yet another data visualization library similar to Matplotlib, but it's fundamentally different because it uses HTML and JavaScript to provide its graphics. This makes Bokeh a reliable platform for web-based dashboards and applications. Its high flexibility allows you to convert visualizations written in other libraries, like Matplotlib. Also, its straightforward commands make visualizing your data simple and sweet.

PyMC3 is a program for Bayesian statistical modeling and probabilistic machine learning that uses Theano.

One machine learning technique for classification and regression problems is gradient boosting. It produces a prediction model in the form of an ensemble of weak prediction models, like decision trees. LightGBM is a framework for gradient boosting that uses tree-based learning algorithms. By growing its tree leaf-wise instead of depth-wise, LightGBM is distributed and efficient. It features faster training speeds, lower memory usage, and better accuracy.

ELI5 is a Python package which helps you debug machine learning classifiers and explain their predictions. It supports numerous packages like scikit-learn, Keras, and LightGBM. It's great for checking over your work, since ELI5 inspects model parameters and tries to figure out how the model works globally. It also inspects an individual prediction of a model, trying to figure out why the model makes the decision it makes. And fun fact: ELI5 stands for "explain like I'm five years old."

Machine learning is a field of artificial intelligence that is very popular in the world today. Keras is an open source deep neural network library. It was developed with a focus on enabling fast experimentation. It allows for easy and
fast prototyping, supports both convolutional neural networks and recurrent networks, and runs seamlessly on CPU and GPU. It's capable of running on top of TensorFlow, CNTK, and Theano. You can also build deep models for the Java Virtual Machine and smartphones on both Android and iOS. This program that automatically converts a given image to HTML markup code was built using Keras.

Whether you're an expert or a beginner, TensorFlow is an end-to-end platform that makes it easy for you to build and deploy ML models. Easy model building using intuitive high-level APIs like Keras makes for immediate model iteration and easy debugging. You can train and deploy your models anywhere: the cloud, the browser, or on device. One of TensorFlow's best features is its simple yet flexible architecture that makes converting ideas to code headache free. Both the image-to-code converter and this program that generates bounding boxes around objects in an image were built using TensorFlow.

PyTorch is a Python package that provides two high-level features: tensor computation with strong GPU acceleration, and deep neural networks built on a tape-based autograd system. You can think of it as NumPy plus higher-level functionality for building deep neural networks. Its biggest users include Facebook's AI Research group and Uber's Pyro software team. There's also a very active PyTorch Discuss community that you should check out. These features allow you to use NumPy functions within it, perform computations much faster than on a CPU, and detect and diagnose many types of errors.

Here are some unique libraries that I just couldn't leave out. Twisted is an event-based framework for internet applications. It comes with many modules, each with various purposes: twisted.web is for HTTP, twisted.conch is for SSHv2 and Telnet, and twisted.words is for IRC, XMPP, and other IM protocols. Some real-world applications of Twisted include the Scrapy library from earlier and the popular streaming platform Twitch.

IPython, short for
Interactive Python, is a command-line shell for interactive computing. It works on most operating systems, including Windows, Mac, and Linux. Interactive shells, a browser-based notebook interface, and other tools for parallel computing are some of its biggest features. Overall, IPython provides a rich architecture to its users. An interesting fact is that IPython is the kernel for the very popular Project Jupyter.

Pillow is a fork of PIL, the Python Imaging Library, and adds support for opening, manipulating, and saving images. It's sometimes referred to as the modern PIL, as it's still maintained and updated. Pillow supports a lot of file types, like PDF, PNG, JPEG, GIF, and ICO. You can blur images, convert formats, and create watermarks. Additionally, there's an active and supportive community where you can ask and get your questions answered.

You might want to check out Poetry if your project depends on several libraries. Poetry allows you to manage Python packaging and dependencies more easily. Projects you create with Poetry can easily be published. Also, an exhaustive dependency resolver will always find a solution to comprehensive dependencies in your projects, if one exists. If there's no solution, a detailed explanation will be provided.

Pywin32 is a Python library that provides some useful methods and classes for interacting with Windows. It allows you to easily access the Component Object Model of Windows OS and control Microsoft applications via Python. You can do things like opening a file in Excel, attaching an Excel file to Outlook, and copying data into Excel.

If you want anything to do with building Android apps, you should check out Kivy. It's an open source Python library for rapid development of applications that make use of innovative user interfaces, like multi-touch apps. Kivy is cross-platform capable and runs on Linux, Windows, OS X, Android, iOS, and Raspberry Pi. You can run the same code on all supported platforms. Its MIT license makes Kivy free to use, and you are even allowed to use it
for a commercial product.

Pendulum is a Python package designed to ease datetime manipulation. Drop-in replacements for the standard datetime and timedelta classes make manipulating your datetimes convenient. Timedelta durations are aware of the datetime instances that created them, and time zone manipulation is made easy through automatic transition switching when shifting time. Additionally, special care has been taken to ensure time zones are handled correctly and are based on the underlying tzinfo implementation.

Loguru is a library which aims to bring enjoyable logging to Python. Next time you decide to use a print statement instead of logging, you should consider using Loguru instead. It's intended to make Python logging less painful by adding a bunch of useful functionalities that solve caveats of the standard loggers. Some include customizable levels, color for logs, and a unified add function that covers handlers, log formatting, message filtering, and setting levels. I'd really recommend giving this library a shot for your next project, as it will help with debugging and maintaining your program.

Now that you know about these libraries, you'll be able to choose the best one for your next project. Did I miss any libraries you think should be on this list? Well, let us know in the comments below. I hope you enjoyed the video. Make sure to subscribe to our channel and check out Kite, the AI-powered autocomplete for Python.

[Music]