CMPE 258-01
# Deep Learning
Dr. Kaikai Liu, Ph.D., Associate Professor
Department of Computer Engineering
San Jose State University
Email: [email protected]
Website: https://www.sjsu.edu/cmpe/faculty/tenure-line/kaikai-liu.php
Spring 2025
## Project Key Components
Option 1: Deep Learning Models: build a full pipeline of a deep learning application with model training
Question/Problem Formulation: propose your own application and formulate the problem
Data Acquisition and Processing:
Identify a suitable dataset for model training and evaluation
State-of-the-Art Models:
Analyze the data, assess and compare various state-of-the-art open source models
Model Architecture Change, Training, Evaluation, Fine-tuning:
Change the model architecture, tune parameters, and proceed with model training/fine-tuning and evaluation.
Gain insight into the performance impact of the model architecture and parameters.
Inference, Optimization, and Real-time Test:
Create a comprehensive end-to-end deep learning application to enable inference using the trained model.
Depending on the chosen hardware platform, optimize model inference to enhance speed or reduce computational costs.
Perform evaluations and visualizations using real test data.
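As a reference point for the training and evaluation components above, here is a minimal sketch of a training loop, assuming PyTorch and a synthetic stand-in dataset; the architecture, hyperparameters, and data are illustrative assumptions, not prescribed by the project.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data; replace with your chosen dataset.
X = torch.randn(512, 16)          # 512 samples, 16 features
y = torch.randint(0, 2, (512,))   # binary labels
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# A small architecture you might vary as part of the project.
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):            # training loop
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

# Evaluation: accuracy on held-out data (here, the training data for brevity).
with torch.no_grad():
    acc = (model(X).argmax(dim=1) == y).float().mean()
print(f"accuracy: {acc.item():.3f}")
```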
[Pipeline figure: Question/Problem Formulation → Data Acquisition and Processing → Model Training, Evaluation, and Fine-tuning → Inference, Optimization, and Real-time Test]
## Project Key Components
Option 2: LLMs and AI Agents
Question/Problem Formulation: propose your own AI application and formulate the problem
Data Acquisition and Processing:
For few-shot examples and evaluations
Compare Multiple LLM Models:
Assess and compare various state-of-the-art LLM models (at least one open source model)
At least one of the following features:
Parameter-efficient Fine-tuning or Training
AI Agent with multiple tools
AI search with RAG
AI App UI Development and Evaluation:
Comprehensive tests and comparisons using multiple test data.
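For the RAG feature listed above, here is a minimal sketch of the retrieve-then-generate idea, assuming scikit-learn for a toy TF-IDF retriever; the corpus and the generate() placeholder are illustrative assumptions, and a real project would swap in an embedding model and an open source LLM.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy document store; in a real project this would be your corpus.
docs = [
    "KITTI provides LiDAR and camera data for autonomous driving.",
    "LoRA is a parameter-efficient fine-tuning method for LLMs.",
    "RAG retrieves relevant passages and feeds them to the model.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    sims = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]
    top = sims.argsort()[::-1][:k]
    return [docs[i] for i in top]

def generate(prompt: str) -> str:
    # Placeholder: swap in a call to your chosen open source LLM.
    return f"(LLM answer conditioned on: {prompt[:60]}...)"

query = "What is parameter-efficient fine-tuning?"
context = "\n".join(retrieve(query, k=1))
print(generate(f"Context:\n{context}\n\nQuestion: {query}"))
```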
[Pipeline figure: Question/Problem Formulation → Data Acquisition and Processing → LLM Models (Fine-tuning, AI Agent) → AI Apps and Evaluation]
## Default Projects
Default project 1: AI Object Detection for Smart City
Default project 2: AI Agent for Airplane Pilot
Develop an advanced airplane pilot assistance system powered by a large language model (LLM) agent.
The system will provide voice-based interaction capabilities locally, with optional cloud access for enhanced model performance and expanded dataset utilization.
The AI agent will deliver five key functionalities (a toy dispatch sketch follows the list):
Radio Command Translation: accurately translate radio commands for pilots.
Flight Manual Assistance: help pilots process and interpret flight manuals.
Online Information Retrieval: access and provide critical online information, such as weather updates.
Flight Route Planning: assist in planning optimal flight routes.
Radio Instruction Recording: record and transcribe radio instructions for reference.
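As a toy illustration of how an agent might dispatch across such tools, here is a minimal sketch in plain Python; the tool names, stub implementations, and keyword router are illustrative assumptions, whereas a real agent would let the LLM select and call tools.

```python
from typing import Callable

# Toy tool implementations; real versions would call ASR, search APIs, etc.
def translate_radio(cmd: str) -> str:
    return f"Plain-English reading of: {cmd}"

def fetch_weather(airport: str) -> str:
    return f"(stub) weather report for {airport}"

TOOLS: dict[str, Callable[[str], str]] = {
    "translate": translate_radio,
    "weather": fetch_weather,
}

def route(user_input: str) -> str:
    """Toy router: a real agent would let the LLM pick the tool."""
    name = "weather" if "weather" in user_input.lower() else "translate"
    return TOOLS[name](user_input)

print(route("What's the weather at KSJC?"))
print(route("Cleared ILS runway 30L, contact tower 124.0"))
```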
## Common Datasets
Image Classification:
ImageNet (https://www.image-net.org/): a large dataset with millions of labeled images across thousands of categories.
CIFAR-10 and CIFAR-100, MNIST, FashionMNIST
Object Detection and Segmentation:
COCO (Common Objects in Context): a dataset with images containing objects labeled with bounding boxes and segmentation masks.
> https://cocodataset.org/
Autonomous driving related datasets: KITTI, Waymo, Argo, nuScenes
Natural Language Processing:
IMDB Reviews: a dataset of movie reviews classified as positive or negative sentiment.
SQuAD (Stanford Question Answering Dataset): a dataset for reading comprehension tasks, where models answer questions based on a given passage.
Machine Translation (WMT19).
Speech Recognition:
LibriSpeech: a dataset of spoken words and sentences collected from audiobooks.
Mozilla Common Voice: a crowdsourced dataset of speech data for various languages.
Time Series Analysis:
UCI Time Series Data Repository: a collection of time series datasets for different applications, such as finance, health, and meteorology.
Video Analysis:
Kinetics: a dataset for action recognition in videos with a wide range of human actions.
Medical Imaging:
MURA (Musculoskeletal Radiographs): a dataset of bone X-rays for identifying musculoskeletal abnormalities.
Chest X-Ray Images (Pneumonia): a dataset for diagnosing pneumonia using chest X-ray images.
Autonomous Driving:
KITTI Vision Benchmark Suite (https://www.cvlibs.net/datasets/kitti/): a dataset for tasks like object detection, tracking, and scene understanding in autonomous driving scenarios.
Waymo Open Dataset: https://waymo.com/open/
Argoverse: https://www.argoverse.org/
nuScenes: https://www.nuscenes.org/
## Project Requirement
Embark on an engaging exploration of a topic that captivates your interest.
Each topic is limited to at most four groups. While individual projects are encouraged, teams of 1-3 members are also welcome. Should your group exceed 3 members, an additional project design and task distribution document (>1 page) must be submitted.
The project encompasses three vital milestones:
A concise, one-page (single-spaced) project proposal.
An engaging presentation of your project to the class.
A comprehensive final project report, accompanied by code and a demonstration video.
The assessment criteria encompass:
The importance of the problem addressed, the novelty of the solutions proposed, technical excellence, the degree of complexity, creativity, code clarity, presentation finesse, and documentation quality.
Each project will be individually graded, ensuring a fair evaluation of efforts.
## Additional Requirements
It's advisable to avoid selecting a basic deep learning tutorial as your project. There are numerous tutorials and online examples available for such projects.
Copying existing tutorials or sample code is prohibited for the project. Your project must incorporate substantial modifications in terms of implementation, model selection, or inference optimization.
Refrain from utilizing commercial APIs lacking open source code. While you can employ commercial APIs for comparative purposes, they shouldn't be your primary model (you need to identify open source alternatives).
Strive to delve into innovative solutions and engage with advanced models. Avoid relying solely on basic solutions to fulfill the project demonstration and meet minimum requirements. If advanced approaches do not yield the desired outcomes, rest assured there won't be penalties, provided you can effectively present your diligent efforts.
## Computing Resources
Option 1: Use your own machine
Python environment, Jupyter Notebook, or JupyterLab: free and flexible, but limited computing capability
Option 2: Cloud resources
Google Colab: for testing and playing with small data
> https://colab.research.google.com
> Pro version ($10/month): https://colab.research.google.com/signup
Google Cloud, AWS, Azure: powerful but expensive
Third-party GPU cloud startups: ~$1/hour
You are responsible for covering the costs of cloud resources used for your personal usage through your own credit card. Please note that the department and our course do not have any contractual agreements with these service providers.
Option 3: SJSU CoE HPC
NVIDIA P100 GPU access: powerful, but with limited permissions and resources
All these computing resources are provided as a courtesy; however, there is no guarantee of resource availability.
## GPU/Computing Resources Shortage
The SJSU College of Engineering provides HPC resources as a courtesy, but availability is not guaranteed.
Students are responsible for exploring alternative computing options for their homework and projects.
In some cases, students may be required to cover computing costs if they choose to use external platforms, including any cloud services or the paid version of Colab.
Students cannot use the unavailability of computing resources as an excuse for not completing their work.
## Deep Learning Thinking
Building the model is "more similar to training a dog than to ordinary programming."
"Unlike ordinary software, our models are massive neural networks. Their behaviors are learned from a broad range of data, not programmed explicitly. Though not a perfect analogy, the process is more similar to training a dog than to ordinary programming. An initial pre-training phase comes first, in which the model learns to predict the next word in a sentence, informed by its exposure to lots of Internet text (and to a vast array of perspectives). This is followed by a second phase in which we fine-tune our models to narrow down system behavior." // OpenAI
## Deep Learning History
The theory of AI is fairly old.
The backpropagation algorithm, for example, was invented more than 40 years ago, and the first computational model for neural networks was proposed almost 80 years ago.
AI Winters: the general public lost interest, and funding for AI research dried up.
The history of AI has been a chain of boom-and-bust cycles. We're currently immersed in the third cycle of optimism.
## Deep Learning History
The Birth of AI (1943-1956)
1943: Warren McCulloch and Walter Pitts proposed a model of artificial neurons.
1949: Donald Hebb introduced the Hebbian learning rule.
1950: The Turing Test.
The term "Artificial Intelligence" was officially coined in 1956 by American computer scientist John McCarthy during the Dartmouth Conference, establishing AI as a distinct area of research and study. At this time, high-level computer languages like FORTRAN, LISP, and COBOL were also invented, fueling enthusiasm for AI research and development.
The first AI boom took place in the late 50s and 60s, when efforts focused on answering whether machines could actually think.
The search for so-called general or strong AI.
The invention of the perceptron (an early example of an artificial neuron or machine learning classifier) in 1957 by Frank Rosenblatt was, for some, an unambiguous indication that general or strong AI was very close.
## Deep Learning History
1969: the limitations of the perceptron were published: it could not learn to solve problems that were not linearly separable.
1972: WABOT-1, the first humanoid robot.
The AI Winter of 1973-1980:
The UK Parliament analyzed the state of AI research after two decades of disappointing progress (specifically in machine translation): the Lighthill Report.
DARPA's frustration with the Speech Understanding Research program at Carnegie Mellon University.
DARPA shifted its focus to mission-oriented, actionable research, which led many AI research groups to lose critical funding.
## Deep Learning History
AI in the limelight again (80s):
The emergence of expert systems: developments revolved around the idea of creating knowledge bases that an inference engine (following logical rules) used to answer questions about a specific domain of knowledge, e.g., medical diagnosis.
> Knowledge was represented mainly as if-then rules rather than through conventional procedural code.
Recurrent neural networks and the backpropagation algorithm (1986) were also developed.
The Second AI Winter (1987-1993):
The computational power available at the time hampered remarkable improvements, which brought on the second AI winter.
1987: collapse of the LISP machine market (general-purpose computers designed to efficiently run Lisp as their main software and programming language).
1988: cancellation of new spending on AI by the Strategic Computing Initiative.
1993: resistance to new expert systems deployment and maintenance.
## Deep Learning History
In the 90s, a new vision brought fresh air to AI.
Moravec's paradox is the observation in artificial intelligence and robotics that, contrary to traditional assumptions, reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources.
The more difficult tasks, however, were indeed those that we do innately, almost effortlessly, like recognizing faces and moving around.
> Researchers advocated building intelligence from the bottom up, taking into account the role of the body in human intelligence.
> Consequently, the quest for general AI lost momentum, and efforts were redirected to solving specific, isolated problems. This gave rise to so-called narrow or weak AI.
It was also the time when advanced ML algorithms like Support Vector Machines and Random Forests, and the area of Reinforcement Learning, were developed.
## Deep Learning History
The term "Deep Learning" was introduced to the machine learning
community by Rina Dechter in 1986 and to artificial neural networks by
Igor Aizenberg and colleagues in 20001. It refers to the use of multiple
layers (hence "deep") in neural networks to model complex patterns in
data.
Deep Belief Network among the first successful deep learning models.
While everybody moved to the algorithms like SVM and all, Geoffrey Hinton still
believed that true intelligence would be achieved only through Neural Networks.
So for almost 20 years i.e. from 1986 to 2006, he worked on neural networks.
And in 2006 he came up with a phenomenal paper on training a deep neural
network. This is the beginning of the era known as Deep Learning. This paper by
Geoffrey Hinton did not receive much popularity until 2012.
> https://www.cs.toronto.edu/~hinton/absps/fastnc.pdf
## Deep Learning Milestones
[Figure: Milestones in the development of neural networks]
## Current AI Spring
The successes of the current "AI spring" or "AI boom" are advances in language translation (in particular, Google Translate), image recognition (spurred by the ImageNet training database) as commercialized by Google Image Search, game-playing systems such as AlphaZero (chess champion) and AlphaGo (Go champion), and autonomous driving.
1997: IBM Deep Blue's triumph.
2000: Google Search uses AI.
2002: AI in homes: Roomba.
2005: DARPA autonomous driving challenge: Stanford's Stanley.
By 2006, AI had made its way into the business world. Companies like Facebook, Twitter, and Netflix began using AI algorithms to improve user experience, personalize content, and power recommendation systems.
2006: neural networks evolve into Deep Learning.
2009: introduction of ImageNet.
## Current AI Spring
2009-2015: Google self-driving car.
2012: AlexNet.
2012: Google Now predictive AI.
2013: deep learning used to understand words (Word2Vec).
2014: AlphaGo.
2015: TensorFlow was built for DL.
2016: DeepMind's AlphaGo defeated the Go champion.
2015: Tesla Autopilot.
2018: Waymo (self-driving car).
2022: the release of OpenAI's AI chatbot ChatGPT reinvigorated the discussion about artificial intelligence and its effects on the world.
A new AI winter could be triggered by overly ambitious or unrealistic promises by prominent AI scientists, or by overpromising on the part of commercial vendors.
## Deep Learning
Fathers of the Deep Learning Revolution receive the ACM A.M. Turing Award:
Bengio, Hinton, and LeCun ushered in major breakthroughs in artificial intelligence.
https://www.acm.org/media-center/2019/march/turing-award-2018
Yoshua Bengio is a Professor at the University of Montreal.
Geoffrey Hinton is VP and Engineering Fellow of Google, Chief Scientific Adviser of The Vector Institute, and a University Professor Emeritus at the University of Toronto.
Yann LeCun is Silver Professor at the Courant Institute of Mathematical Sciences at New York University, and VP and Chief AI Scientist at Facebook.
## AlexNet won the ImageNet challenge in 2012
Geoffrey Hinton and his students:
Ilya Sutskever (MS 2007, PhD 2013).
In 2012, Sutskever built AlexNet in collaboration with Hinton and Alex Krizhevsky.
Krizhevsky and Sutskever joined Hinton's new research company, DNNResearch, a spinoff of Hinton's research group. In March 2013, Google acquired DNNResearch for $5 million, shortly after the team won the contest (after awarding the team of three $600,000 for their work in neural networks and language and image processing).
## Google Brain
Baidu, Microsoft, and DeepMind (itself acquired by Google in 2014) all wanted to buy a University of Toronto startup that studies neural networks. The one-year-old company was launched by computer science professor Geoffrey Hinton and two of his graduate students, Alex Krizhevsky and Ilya Sutskever.
## AlexNet won the ImageNet challenge in 2012
Geoffrey Hinton and his students:
Hinton retired from Google in 2023.
Alex Krizhevsky left Google in September 2017 after losing interest in the work, to work at the company Dessa in support of new deep learning techniques. He is the creator of the CIFAR-10 and CIFAR-100 datasets.
At the end of 2015, Ilya Sutskever left Google to become cofounder and chief scientist of the newly founded non-profit organization OpenAI.
The total contributions actually collected by OpenAI amounted to only $130 million through 2019, $100 million of which came from Musk.
Sutskever was formerly one of the six board members of the non-profit entity which controls OpenAI.
In 2023, people speculated that the firing of Sam Altman resulted in part from a conflict over the extent to which the company should commit to AI safety.
Following these events, Sutskever stepped down from the board of OpenAI.
## Deep Learning
From basic Neuron to AlexNet
## Neural Network
Neural networks are a class of machine learning algorithms used to model complex patterns in datasets using multiple hidden layers and non-linear activation functions.
A neural network takes an input, passes it through multiple layers of hidden neurons (mini-functions with unique coefficients that must be learned), and outputs a prediction representing the combined output of all the neurons.
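To make this concrete, here is a minimal sketch of a single forward pass through a two-layer network in NumPy; the layer sizes and random weights are illustrative assumptions, and in practice the coefficients would be learned by training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: 4 input features, 8 hidden neurons, 3 output classes.
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # hidden-layer coefficients to be learned
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)   # output-layer coefficients

def relu(z: np.ndarray) -> np.ndarray:
    """Non-linear activation: without it the layers collapse to one linear model."""
    return np.maximum(0.0, z)

def forward(x: np.ndarray) -> np.ndarray:
    hidden = relu(x @ W1 + b1)   # each hidden neuron is a "mini-function" of the input
    return hidden @ W2 + b2      # output layer combines all hidden outputs into a prediction

x = rng.normal(size=4)           # one input example
print(forward(x))                # raw scores for 3 classes
```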
# Thank You
Address: ENG 257, SJSU
Email: [email protected]