Transcript for:
Deep Learning Projects and Concepts Overview

CMPE 258-01

# Deep Learning

Dr. Kaikai Liu, Ph.D.
Associate Professor, Department of Computer Engineering, San Jose State University
Email: [email protected]
Website: https://www.sjsu.edu/cmpe/faculty/tenure-line/kaikai-liu.php
Spring 2025

## Project Key Components, Option 1: Deep Learning Models

Build a full pipeline of a deep learning application with model training (a minimal code sketch follows the project-requirement slides below):

- Question/Problem Formulation: propose your own application and formulate the problem.
- Data Acquisition and Processing: identify a suitable dataset for model training and evaluation.
- State-of-the-Art Models: analyze the data; assess and compare various state-of-the-art open-source models.
- Model Architecture Change, Training, Evaluation, Fine-tuning: change the model architecture, tune parameters, and proceed with model training/fine-tuning and evaluation. Gain insight into how the model architecture and parameters affect performance.
- Inference, Optimization, and Real-time Test: create a comprehensive end-to-end deep learning application that runs inference with the trained model. Depending on the chosen hardware platform, optimize model inference to enhance speed or reduce computational costs. Perform evaluations and visualizations using real test data.

Pipeline: Question/Problem Formulation → Data Acquisition and Processing → Model Training, Evaluation, and Fine-tuning → Inference, Optimization, and Real-time Test

## Project Key Components, Option 2: LLMs and AI Agent

- Question/Problem Formulation: propose your own AI application and formulate the problem.
- Data Acquisition and Processing: for few-shot examples and evaluations.
- Compare Multiple LLM Models: analyze, assess, and compare various state-of-the-art LLM models (at least one open-source model).
- At least one of the following features: parameter-efficient fine-tuning or training; an AI agent with multiple tools; AI search with RAG (a minimal RAG sketch follows the project-requirement slides below).
- AI App UI Development and Evaluation: comprehensive testing and comparison using multiple test data.

Pipeline: Question/Problem Formulation → Data Acquisition and Processing → LLM Models (Fine-tuning, AI Agent) → AI Apps and Evaluation

## Default Projects

- Default Project 1: AI Object Detection for Smart City
- Default Project 2: AI Agent for Airplane Pilot

Develop an advanced airplane pilot assistance system powered by a large language model (LLM) agent. The system will provide voice-based interaction capabilities locally, with optional cloud access for enhanced model performance and expanded dataset utilization. The AI agent will deliver five key functionalities:

- Radio Command Translation: accurately translate radio commands for pilots.
- Flight Manual Assistance: help pilots process and interpret flight manuals.
- Online Information Retrieval: access and provide critical online information, such as weather updates.
- Flight Route Planning: assist in planning optimal flight routes.
- Radio Instruction Recording: record and transcribe radio instructions for reference.

## Common Datasets

- Image Classification: ImageNet (https://www.image-net.org/), a large dataset with millions of labeled images across thousands of categories; CIFAR-10 and CIFAR-100; MNIST; FashionMNIST.
- Object Detection and Segmentation: COCO (Common Objects in Context), a dataset of images containing objects labeled with bounding boxes and segmentation masks.
  (https://cocodataset.org/)
- Autonomous driving datasets: KITTI, Waymo, Argo, nuScenes.
- Natural Language Processing: IMDB Reviews, a dataset of movie reviews classified as positive or negative sentiment; SQuAD (Stanford Question Answering Dataset), a reading-comprehension dataset in which models answer questions based on a given passage; machine translation (WMT19).
- Speech Recognition: LibriSpeech, a dataset of spoken words and sentences collected from audiobooks; Mozilla Common Voice, a crowdsourced speech dataset covering many languages.

## Common Datasets (continued)

- Time Series Analysis: UCI Time Series Data Repository, a collection of time-series datasets for applications such as finance, health, and meteorology.
- Video Analysis: Kinetics, a dataset for action recognition in videos covering a wide range of human actions.
- Medical Imaging: MURA (Musculoskeletal Radiographs), a dataset of bone X-rays for identifying musculoskeletal abnormalities; Chest X-Ray Images (Pneumonia), a dataset for diagnosing pneumonia from chest X-rays.
- Autonomous Driving: KITTI Vision Benchmark Suite (https://www.cvlibs.net/datasets/kitti/), for tasks like object detection, tracking, and scene understanding in autonomous driving scenarios; Waymo Open Dataset (https://waymo.com/open/); Argoverse (https://www.argoverse.org/); nuScenes (https://www.nuscenes.org/).

## Project Requirement

- Embark on an engaging exploration of a topic that captivates your interest. Limit one topic per group of four teams or fewer.
- While individual projects are encouraged, teams of 1-3 members are also welcome. Should your group exceed 3 members, an additional project-design and task-distribution document (>1 page) must be submitted.
- The project encompasses three vital milestones:
  - A concise, one-page (single-spaced) project proposal.
  - An engaging presentation of your project to the class.
  - A comprehensive final project report, accompanied by code and a demonstration video.
- The assessment criteria encompass: the importance of the problem addressed, the novelty of the proposed solutions, technical excellence, degree of complexity, creativity, code clarity, presentation finesse, and documentation quality.
- Each project will be individually graded, ensuring a fair evaluation of efforts.

## Additional Requirements

- Avoid selecting a basic deep learning tutorial as your project; numerous tutorials and online examples already exist for such projects.
- Copying existing tutorials or sample code is prohibited. Your project must incorporate substantial modifications in terms of implementation, model selection, or inference optimization.
- Refrain from utilizing commercial APIs that lack open-source code. You may employ commercial APIs for comparison, but they should not be your primary model (you need to identify open-source alternatives).
- Strive to explore innovative solutions and engage with advanced models. Avoid relying solely on basic solutions that merely fulfill the project demonstration and meet minimum requirements. If advanced approaches do not yield the desired outcomes, rest assured there will be no penalty, provided you can effectively present your diligent efforts.
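To make the Option 1 pipeline concrete, here is a minimal sketch assuming PyTorch and torchvision: load a common dataset, change the architecture of a pretrained model by replacing its classification head, fine-tune, and evaluate. CIFAR-10, ResNet-18, and all hyperparameters here are illustrative stand-ins for whatever your project actually uses, and the single training epoch just keeps the sketch short.

```python
# Minimal Option 1 sketch: dataset -> architecture change -> fine-tune -> evaluate.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
import torchvision
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Data acquisition and processing: CIFAR-10, resized to the input size
# expected by ImageNet-pretrained backbones.
tf = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
train_set = torchvision.datasets.CIFAR10("data", train=True, download=True, transform=tf)
test_set = torchvision.datasets.CIFAR10("data", train=False, download=True, transform=tf)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = DataLoader(test_set, batch_size=256)

# Model architecture change: swap the 1000-class ImageNet head
# for a 10-class head matching CIFAR-10.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 10)
model = model.to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Training / fine-tuning (one epoch for brevity; tune epochs, lr, etc.).
model.train()
for x, y in train_loader:
    x, y = x.to(device), y.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Evaluation: top-1 accuracy on the held-out test split.
model.eval()
correct = total = 0
with torch.no_grad():
    for x, y in test_loader:
        x, y = x.to(device), y.to(device)
        correct += (model(x).argmax(dim=1) == y).sum().item()
        total += y.numel()
print(f"test accuracy: {correct / total:.3f}")
```

The inference-optimization step would follow the same structure: export or compile the trained model for your target hardware and time it on real test data.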
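For Option 2, the sketch below shows the core of AI search with RAG, assuming the open-source sentence-transformers library for embeddings: reference chunks (imagine flight-manual excerpts for Default Project 2) are embedded, the most relevant ones are retrieved by cosine similarity, and a grounded prompt is assembled. The final LLM call is deliberately left out, since any open-source model can fill that slot; the example documents are toy placeholders.

```python
# Minimal RAG sketch: embed a small document collection, retrieve the
# chunks most relevant to a question, and build a grounded prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy "knowledge base"; in a real project these would be chunks of
# your own reference documents (e.g., flight manuals).
docs = [
    "Squawk 7500 indicates unlawful interference (hijacking).",
    "METAR reports encode current airport weather observations.",
    "VFR flight requires minimum visibility and cloud clearance.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k docs most similar to the question (cosine similarity)."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec          # vectors are unit-norm, so dot = cosine
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

question = "What does a METAR report contain?"
context = "\n".join(retrieve(question))
prompt = (
    "Answer using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)

# Pass `prompt` to whichever LLM you choose (an open-source model via
# Hugging Face, a local llama.cpp server, etc.).
print(prompt)
```

The same retrieve-then-prompt pattern extends to an AI agent with multiple tools: retrieval simply becomes one tool among several the agent can invoke.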
## Computing Resources

- Option 1: Use your own machine. A Python environment with Jupyter Notebook or JupyterLab: free and flexible, but limited computing capability.
- Option 2: Cloud resources.
  - Google Colab, for testing and playing with small data: https://colab.research.google.com (Pro version, $10/month: https://colab.research.google.com/signup).
  - Google Cloud, AWS Cloud, Azure Cloud: powerful but expensive.
  - Third-party GPU cloud startups: roughly $1/hour.
  - You are responsible for covering the costs of cloud resources used for your personal usage through your own credit card. Please note that the department and our course do not have any contractual agreements with these service providers.
- Option 3: SJSU CoE HPC. NVIDIA P100 GPU access: powerful, but with limited permissions and resources.

All of these computing resources are provided as a courtesy; there is no guarantee of resource availability.

## GPU/Computing Resources Shortage

The SJSU College of Engineering provides HPC resources as a courtesy, but availability is not guaranteed. Students are responsible for exploring alternative computing options for their homework and projects. In some cases, students may be required to cover computing costs if they choose to use external platforms, including any cloud services or the paid version of Colab. Students cannot use the unavailability of computing resources as an excuse for not completing their work.

## Deep Learning Thinking

Building the model is "more similar to training a dog than to ordinary programming":

> Unlike ordinary software, our models are massive neural networks. Their behaviors are learned from a broad range of data, not programmed explicitly. Though not a perfect analogy, the process is more similar to training a dog than to ordinary programming. An initial pre-training phase comes first, in which the model learns to predict the next word in a sentence, informed by its exposure to lots of Internet text (and to a vast array of perspectives). This is followed by a second phase in which we fine-tune our models to narrow down system behavior. (OpenAI)

## Deep Learning History

- The theory of AI is fairly old: the backpropagation algorithm, for example, was invented more than 40 years ago, and the first computational model of neural networks was proposed almost 80 years back.
- AI Winters: periods when the general public lost interest and funding for AI research dried up. The history of AI has been a chain of boom-and-bust cycles; we are currently immersed in the third cycle of optimism.

## Deep Learning History: The Birth of AI (1943-1956)

- 1943: Warren McCulloch and Walter Pitts proposed a model of artificial neurons.
- 1949: Donald Hebb introduced the Hebbian learning rule.
- 1950: the Turing Test.
- 1956: the term "Artificial Intelligence" was officially coined by American computer scientist John McCarthy during the Dartmouth Conference, establishing AI as a distinct area of research and study. Around this time, high-level computer languages like FORTRAN, LISP, and COBOL were also invented, fueling enthusiasm for AI research and development.
- The first AI boom took place in the late 1950s and 1960s, when efforts focused on answering whether machines could actually think: the search for so-called general or strong AI.
- The invention of the perceptron (an early artificial neuron, or machine learning classifier) in 1957 by Frank Rosenblatt was, for some, an unambiguous indication that general or strong AI was very close.
## Deep Learning History

- 1969: the limitations of the perceptron were laid out (in Minsky and Papert's *Perceptrons*): it could not learn to solve problems that are not linearly separable (e.g., XOR).
- 1972: WABOT-1, the first humanoid robot.
- The AI Winter of 1973-1980:
  - The UK Parliament commissioned an analysis of the state of AI research after two decades of disappointing progress (specifically in machine translation): the Lighthill Report.
  - DARPA's frustration with the Speech Understanding Research program at Carnegie Mellon University.
  - DARPA shifted its focus to mission-oriented, actionable research, which led many AI research groups to lose critical funding.

## Deep Learning History: AI in the Limelight Again (1980s)

- The emergence of expert systems: developments revolved around building knowledge bases that an inference engine (following logical rules) used to answer questions about a specific domain of knowledge, e.g. medical diagnosis. Knowledge was represented mainly as if-then rules rather than through conventional procedural code.
- Recurrent neural networks and the backpropagation algorithm (1986) were also developed.
- The Second AI Winter (1987-1993): the limited computational power of the time hampered remarkable improvements, which brought on the second AI winter.
  - 1987: collapse of the LISP machine market (general-purpose computers designed to efficiently run Lisp as their main software and programming language).
  - 1988: cancellation of new AI spending by the Strategic Computing Initiative.
  - 1993: resistance to deployment and maintenance of new expert systems.

## Deep Learning History: A New Vision in the 1990s

- Moravec's paradox is the observation in artificial intelligence and robotics that, contrary to traditional assumptions, reasoning requires very little computation, while sensorimotor and perception skills require enormous computational resources. The more difficult tasks were those we do innately, almost effortlessly, like recognizing faces and moving around.
- Researchers advocated building intelligence from the bottom up, taking into account the role of the body in human intelligence.
- Consequently, the quest for general AI lost momentum, and efforts were redirected to solving specific, isolated problems. This gave rise to so-called narrow or weak AI.
- It was also the time when advanced ML algorithms like Support Vector Machines and Random Forests, and the area of Reinforcement Learning, were developed.

## Deep Learning History: The Term "Deep Learning"

- The term "Deep Learning" was introduced to the machine learning community by Rina Dechter in 1986, and to artificial neural networks by Igor Aizenberg and colleagues in 2000. It refers to the use of multiple layers (hence "deep") in neural networks to model complex patterns in data.
- The Deep Belief Network was among the first successful deep learning models.
- While most researchers moved to algorithms like SVMs, Geoffrey Hinton still believed that true intelligence would be achieved only through neural networks. For almost 20 years, from 1986 to 2006, he kept working on them, and in 2006 he published a phenomenal paper on training a deep neural network. This marks the beginning of the era known as Deep Learning. The paper did not receive much popularity until 2012.
> https://www.cs.toronto.edu/~hinton/absps/fastnc.pdf

## Deep Learning Milestones

(Timeline figure: milestones in the development of neural networks.)

## Current AI Spring

The successes of the current "AI spring" or "AI boom" are advances in language translation (in particular, Google Translate), image recognition (spurred by the ImageNet training database, as commercialized by Google Image Search), game-playing systems such as AlphaZero (chess champion) and AlphaGo (Go champion), and autonomous driving.

- 1997: IBM Deep Blue's triumph in chess.
- 2000: Google Search uses AI.
- 2002: AI in homes: the Roomba.
- 2005: DARPA autonomous driving challenge, won by Stanford's Stanley.
- By 2006, AI had made its way into the business world. Companies like Facebook, Twitter, and Netflix began using AI algorithms to improve user experience, personalize content, and power recommendation systems.
- 2006: neural networks evolve into deep learning.
- 2009: introduction of ImageNet.

## Current AI Spring (continued)

- 2009-2015: Google self-driving car.
- 2012: AlexNet.
- 2012: Google Now predictive AI.
- 2013: deep learning used to understand words (Word2Vec).
- 2014: AlphaGo development begins at DeepMind.
- 2015: TensorFlow is released for deep learning.
- 2015: Tesla Autopilot.
- 2016: DeepMind's AlphaGo defeats the world champion.
- 2018: Waymo self-driving cars.
- 2022: the release of OpenAI's AI chatbot ChatGPT reinvigorates the discussion about artificial intelligence and its effects on the world.

A new AI winter could be triggered by overly ambitious or unrealistic promises from prominent AI scientists, or by overpromising on the part of commercial vendors.

## Deep Learning: Fathers of the Deep Learning Revolution Receive the ACM A.M. Turing Award

Bengio, Hinton, and LeCun ushered in major breakthroughs in artificial intelligence (https://www.acm.org/media-center/2019/march/turing-award-2018):

- Yoshua Bengio is a professor at the University of Montreal.
- Geoffrey Hinton is VP and Engineering Fellow of Google, Chief Scientific Adviser of The Vector Institute, and a University Professor Emeritus at the University of Toronto.
- Yann LeCun is Silver Professor at the Courant Institute of Mathematical Sciences at New York University, and VP and Chief AI Scientist at Facebook.

## AlexNet Won the ImageNet Challenge in 2012: Geoffrey Hinton and His Students

- In 2012, Ilya Sutskever (MS 2007, PhD 2013) built AlexNet in collaboration with Hinton and Alex Krizhevsky.
- Krizhevsky and Sutskever joined Hinton's new research company DNNResearch, a spinoff of Hinton's research group. In March 2013, Google acquired DNNResearch for $5 million, shortly after the team won the contest (after awarding the team of three $600,000 for their work in neural networks and language and image processing).
- Google Brain, Baidu, Microsoft, and DeepMind (itself acquired by Google in 2014) all wanted to buy the University of Toronto startup that studied neural networks: a one-year-old company launched by computer science professor Geoffrey Hinton (right) and two of his graduate students, Alex Krizhevsky and Ilya Sutskever (left).
- Hinton retired from Google in 2023.
- Alex Krizhevsky left Google in September 2017, after losing interest in the work, to work at the company Dessa in support of new deep-learning techniques.
- He is the creator of the CIFAR-10 and CIFAR-100 datasets.
- At the end of 2015, Ilya Sutskever left Google to become cofounder and chief scientist of the newly founded non-profit organization OpenAI. The total contributions actually collected by OpenAI through 2019 amounted to only $130 million, of which $100 million came from Musk.
- Sutskever was formerly one of the six board members of the non-profit entity that controls OpenAI. In 2023, people speculated that the firing of Sam Altman resulted in part from a conflict over the extent to which the company should commit to AI safety. Following these events, Sutskever stepped down from the board of OpenAI.

## Deep Learning: From Basic Neuron to AlexNet

## Neural Network

Neural networks are a class of machine learning algorithms used to model complex patterns in datasets using multiple hidden layers and non-linear activation functions. A neural network takes an input, passes it through multiple layers of hidden neurons (mini-functions with unique coefficients that must be learned), and outputs a prediction representing the combined input of all the neurons. (A minimal code sketch follows the closing slide below.)

# Thank You

Address: ENG257, SJSU
Email Address: [email protected]
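As a postscript to the Neural Network slide, here is a minimal sketch of that input → hidden layers → prediction structure. PyTorch is assumed purely for illustration, and the layer sizes are arbitrary.

```python
# A neural network as defined above: input -> hidden layers with
# non-linear activations -> output prediction.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 16),   # hidden layer 1: 4 input features -> 16 neurons
    nn.ReLU(),          # non-linear activation
    nn.Linear(16, 16),  # hidden layer 2
    nn.ReLU(),
    nn.Linear(16, 3),   # output layer: scores for 3 classes
)

x = torch.randn(1, 4)    # one example with 4 features
prediction = model(x)    # forward pass through all layers
print(prediction.shape)  # torch.Size([1, 3])
```

Each `nn.Linear` holds the "unique coefficients that must be learned"; training adjusts them by backpropagation, as covered in the history slides above.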