Transcript for:
Introduction to Cloud Computing and Data Analytics

[Music] There's something happening right now, everywhere in the world, that's changing all of our lives in every way: cloud computing. How can cloud computing make that much change? By connecting us, people, with data quickly, easily, anywhere, at any time. It affects how we communicate, work, shop, plan, and even how we relax and have fun. The cloud is changing people's lives, and it's also completely reshaping and improving business.

These days, data is the cornerstone of all kinds of organizations. They depend on non-stop information about sales transactions, consumer feedback, inventory and purchase orders, customer service interactions, market research statistics, and so much more. Uninterrupted access to business data is a must for organizations, which creates another must: people who can assess that data and put it to work. And this is why the demand for cloud data analytics professionals keeps growing and growing. We need people like you to help organizations understand their customers, collaborate with partners, strategize for the future, mitigate risk, and become more flexible and resilient. The content in this program will equip you with the knowledge and skills required of entry-level roles in the field of cloud data analytics.

Hi, I'm Joey. Here at Google, I am an analytics manager. This means that I lead a team of business analysts whose job is to provide data-backed insights to inform key business decisions. I'm so happy to welcome you to the program. I'm your course one instructor, and I'll be by your side the whole way through this course.

I grew up in a single-parent household in the Los Angeles area, with strong roots in my Mexican-American heritage. Living in a big, diverse, complicated city like LA, while challenging at times, definitely instilled in me a passion for connecting with diverse groups and helped me build interpersonal skills that have been super valuable in life and in my career. My career path wasn't linear or planned, but through an internship and an early-career
rotational program, I discovered a passion for data analytics, specifically using data to help people make better decisions or gain knowledge they wouldn't otherwise have. As an analyst, I want to show folks that data is for everyone and make technical work less intimidating and more approachable to all.

The program is divided into courses based on different cloud data analytics processes. The course topics include an introduction to cloud computing and data analytics, cloud storage and data management, data processing and analysis in the cloud, and visualization of data in the cloud. I encourage you to complete the courses in order, as each topic builds on what you've learned before. The final course is the capstone. It's a great opportunity to demonstrate the knowledge and skills you gain throughout your academic journey.

We've got videos and readings to teach you cloud data analytics concepts and skills. Then interactive activities and labs will let you practice those concepts and skills. You can take the labs more than once, so if you hit some trouble spots, just keep at it. You'll also have quizzes to confirm your understanding and glossaries to help you prepare to do your very best. And career resources, including resume and interview prep, will help you prepare to apply for jobs and impress hiring managers.

You'll hear from Googlers like me working in cloud computing. We'll give you an inside perspective on what it's like in our industry and share personal stories of how we got into the field. Some of these Googlers are going to join me in guiding you through the courses. Let's take a second to meet them now.

Hello, I'm Eric, and I'm a product analyst at Google. In course 2, you'll explore how data is structured and organized. You'll gain experience with data lakehouse architecture and cloud components like BigQuery, Google Cloud Storage, and Dataproc to efficiently store, analyze, and process large data sets. Next, you'll meet my colleague Alex.

Hey there, I'm Alex, and I'm a data
analytics customer engineer. I'm really looking forward to spending time with you as you learn all about the data journey, from collection to insights. You'll explore data transformation and practice strategies to transform real data sets to solve business needs.

Hi, I'm CJ, and I work in data analytics here at Google. I'll guide you through the key stages of visualizing data in the cloud: storytelling, planning, exploring data, designing visualizations, and collaborating with data. You'll use Looker to create data visualizations and build dashboards.

I'm Christine, your course 5 instructor. Together, we'll put everything you learn from across courses 1 through 4 into action in a capstone project, and you'll create impressive work examples to share with future employers.

All of us are thrilled to introduce you to the fascinating and rewarding field of cloud data analytics, so let's get you started on your cloud journey.

Not too long ago, when a company stored data or ran programs, it needed a huge room filled with a bunch of gigantic, noisy computers humming away right there in the office. But in the 1960s, a group of engineers asked: what if we share computing power among many users, so not everyone needs their own computer? Fast forward a few decades, and here we are, with remote data centers ready to store our data, run our apps, help us with analysis, and so much more.

In this course, you'll start your journey into the world of cloud computing and gain the fundamental knowledge you need to be successful in the field. Whether you're a cloud newcomer or seeking to level up your cloud skills, you've come to the right place. This course will provide you with a solid foundation in key concepts, skills, and tools used for data analysis with Google Cloud.

I started my career as a philosophy graduate with no professional experience, unsure of how my education would translate into a job. But after being exposed to a few different roles at Google, I found a passion for data analysis and engineering, where a lot of
entry-level tasks were like many logic puzzles that I was paid to solve. I looked forward to the technical challenges that the role offered, which provided the building blocks for my current career path.

I first learned to write SQL in my role as an HR analyst at Google. One of my first responsibilities was as a primary responder on our team's ticket queue. Each day, we'd receive requests from internal business partners to produce data reports, usually in the form of big spreadsheets with custom logic, based on real business questions and problems. I really enjoyed the process of fulfilling these requests: starting with transforming the business request into an analytics problem, using SQL to mold the data into an answer, and providing a data set that was understandable and approachable to our non-technical users. It felt great to offer a service to our users and give them information that they couldn't otherwise obtain. It was fun.

As more organizations adopt cloud-based solutions, there's a growing need for skilled cloud professionals to help them make the transformation. To get you on your way, I'll introduce you to the program, let you know what to expect moving forward, and share some great tips for successfully completing the certificate. You'll learn how to define cloud computing, identify its components, and differentiate between cloud and traditional computing. You'll explore cloud data analysis compared to on-premises, physical data analysis, and you'll learn about the impact of cloud data analytics on all kinds of businesses, with a special focus on the Google Cloud Architecture Framework. Then you'll discover the inner workings of data management and the data life cycle, and the cloud data analyst's role in keeping both running smoothly. You'll also explore how cloud professionals collaborate to create some really cool business projects together. Finally, you'll discover key tools in the cloud data analyst toolbox and learn about the importance of process management in cloud computing.
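The request-to-query workflow described above, turning a business question into SQL and molding the data into an answer, can be sketched with a tiny example. This is illustrative only: it uses Python's built-in sqlite3 module rather than any Google tool, and the table name, columns, and data are all made up for the sake of the demonstration.

```python
# Hypothetical sketch: answer the business question
# "How many tickets did each team submit, and what share were urgent?"
# using SQL. Table and data are invented; sqlite3 stands in for a real warehouse.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tickets (
        id INTEGER PRIMARY KEY,
        team TEXT,
        urgent INTEGER  -- 1 if the request was flagged urgent
    )
""")
conn.executemany(
    "INSERT INTO tickets (team, urgent) VALUES (?, ?)",
    [("sales", 1), ("sales", 0), ("finance", 1), ("finance", 1), ("sales", 0)],
)

# Mold the raw rows into an answer: one summary row per team.
rows = conn.execute("""
    SELECT team,
           COUNT(*) AS total_tickets,
           SUM(urgent) * 1.0 / COUNT(*) AS urgent_share
    FROM tickets
    GROUP BY team
    ORDER BY team
""").fetchall()

for team, total, share in rows:
    print(team, total, round(share, 2))
```

The result is a small, readable summary table, the kind of "understandable and approachable" data set a non-technical stakeholder could act on; the same GROUP BY pattern scales up directly to warehouse-sized tables.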
As you progress, you'll be introduced to Google cloud-based tools including BigQuery and Dataproc. And after completing this course, you'll know about cloud data tools and be able to understand and communicate cloud benefits, share timely insights, and so much more. I'm so excited to be part of your cloud data analytics exploration, and I'm here to guide you every step of the way. Let's keep the momentum going and head on over to the next topic.

Hey there! Coming up, we have many exciting things to discover about the cloud. Here's a quick breakdown. First, we'll check out the basics of cloud data analytics. You'll learn about its history and explore the many benefits of cloud computing. After that, you'll consider the differences between cloud computing and traditional computing. This includes their network infrastructures, defining characteristics, and advantages and limitations. Next, you'll tap into the program resources so you can make a plan to be successful and career-ready. To wrap up, peruse the glossary with key terms and definitions. Meet you again soon.

My name is Ben, and I'm the senior vice president of learning and sustainability at Google. I've always been interested in learning because, for me, my mother was a school teacher, and I felt that learning is really what enables people to reach a different point than they otherwise would. I know it enabled me to go to a place I could not have dreamt of being were it not for the education I got, and I think it's incredibly important for people to have access to that kind of opportunity. Growing up in India, I did have access to a good school. I did not come from a wealthy family, but I had access to a good school, and I saw the difference that made in my life. And I think today, with the help of technology, we can hopefully bring more of that opportunity to more people in the world.

The cloud is really important because it's a trajectory of where computation is going. If you think about all of the major products that you
use, almost all of them are now based in some online cloud data center, with access to all these amazing computing resources, and that enables these services to really provide amazing things for their users. So starting in cloud technology enables you to participate in that whole economy of jobs and opportunities that consists of building these powerful facilities in the cloud that are being used by people around the world.

One of the really interesting ways in which education is evolving is allowing people to build and learn individual skills through various skilling courses. Many aspects of education are not available to everybody everywhere, unfortunately, but it's possible to build the basic skills that one needs from an education more piecemeal today. And I think the approach of skilling can allow people to build up the pieces of the education that they really need, in the way that they have access to, in the way that they have time for, in the way that they have the resources for. The initial parts of learning anything are learning the basics and the fundamentals, whether it is a sport or a physical skill like carpentry. The first steps are learning the basics, so persevere with it, and it'll get really interesting. I've been working with computers for, what is it now, over 35 years, and it is still fascinating every day.

Hello, cloud enthusiasts! Get ready to learn exactly what cloud computing is all about, including how it works, the components of a cloud infrastructure, and different cloud service models. So first up, what exactly is cloud computing? Cloud computing is the practice of using on-demand computing resources as services hosted over the internet. "Over the internet" is what makes up the cloud part. It eliminates the need for organizations to find, set up, or manage resources themselves, and they only pay for what they use. Cloud computing uses a network to connect users to a cloud platform. This is a virtual space where
they can access and borrow computing services. A primary computer handles all communication between devices and servers to share information, and there are privacy and security measures to keep everything safe.

Here's another way to think about it. Picture cloud computing like a shared kitchen in a rental apartment owned by a property management company, or, in the case of the cloud, a third-party service host. The kitchen has many appliances and cooking tools, just as the cloud platform has servers, storage, hardware, and software. So when someone in the apartment gets hungry, they just go ahead and cook a meal in the well-equipped kitchen. They don't each have to buy their own wooden spoons or toaster oven. Likewise, the cloud enables organizations to access computing resources on demand without spending time and money buying and maintaining their own storage, hardware, and software.

It's the unique infrastructure of a cloud computing model that makes all of this possible. This infrastructure has four main components: hardware, storage, network, and virtualization. Let's check out hardware first. Types of hardware include servers, processors and memory, network switches, routers and cables, firewalls and load balancers, and cooling systems and power supplies. These are the physical items needed to keep things running.

Now, data storage in a cloud computing infrastructure can occur in three main ways: file, object, or block. File storage keeps data in one place and organizes it in a simple, easy-to-understand way through a hierarchy of files and folders. This is the oldest and most widely used data storage system, but it's a bit cumbersome and can only accomplish so much. Next, object storage holds unstructured data along with its metadata. Metadata is just data about data. For example, a picture taken with a smartphone might contain information about the location, the date, and the type of device that captured the image. This is really useful for understanding the photo, just as metadata explains what its own
data is all about. Lastly, block storage divides large volumes of data into smaller pieces, optimally organized with unique labels. An advantage of block storage is that the data is easily accessible, but it can be expensive and has limited capability to handle metadata.

All right, now we have the network. After all, cloud computing infrastructure needs a way to connect its back-end resources, and this connection is made possible through a network of the physical hardware. Through this network, users tap into cloud resources using some of the hardware mentioned earlier, including routers and firewalls. Basically, the physical network setup is what enables the virtual one to operate. Finally, there is virtualization, which is a technology that creates a virtual version of physical infrastructure like servers, storage, and networks. This is what lets users work with these resources without a direct connection to the physical machines themselves.

Here at Google, we have many data centers. A data center is a physical building that contains servers, computer systems, and associated components. These facilities provide a centralized location for vast amounts of data, and skilled cloud analysts access this valuable information right through the cloud for all sorts of business tasks and projects. Cloud analysts select and extract relevant data, then prepare it for processing and examination. They know how to expertly analyze, visualize, and share data discoveries to uncover valuable insights and make smart business decisions.

So it's really important to know that there are three primary models to choose from, each offering a different level of flexibility and control. These models are infrastructure as a service, or IaaS; platform as a service, or PaaS; and software as a service, or SaaS. First, IaaS is a cloud computing model that offers on-demand access to information technology, or IT, infrastructure services, including hardware, storage, network, and virtualization tools. With IaaS, a service provider hosts, maintains, and updates the infrastructure. Your organization would
manage everything else: your operating system, your data, and your applications. An IaaS model provides the highest level of control over your IT resources and works a lot like traditional on-premises IT. An example of an IaaS model is cloud storage, like emails you've sorted into an online folder. Here's another example. When someone leases a car, it's like they're borrowing it for a while, having fun driving it around, but they have to give it back when the lease agreement is up. Well, IaaS is kind of like that. A user picks the infrastructure they want and uses it for the contracted period, but they do not own it.

Next, PaaS provides hardware and software tools to create an environment for the development of cloud applications, simplifying the application development process. PaaS is all about helping users build apps, so your organization would enjoy being able to fully focus on app development without the burden of managing and maintaining the underlying infrastructure. Your developers would create, test, troubleshoot, launch, host, and maintain your app all on the platform. PaaS is like hopping into a taxi and telling the driver where to take you. You're not behind the wheel, but you trust the driver to get you to where you need to be.

Lastly, SaaS provides users with a licensed subscription to a complete software package. This includes the infrastructure, maintenance and updates, and the application itself. Other users also have access to use the same services. Using SaaS, you just connect to the app through the internet. Think of SaaS like riding the bus. You pick your stop from routes that are already set, and you share the bus ride with other people. Remember that these examples are meant to demonstrate the level of individual customization in IaaS, PaaS, and SaaS. They do not refer to any actual hardware or software details.

Woo, we've covered a lot about cloud computing infrastructure and service models! I think we've earned ourselves a study snack. I'm going to go cook something up in my
cloud kitchen. Catch you later!

I love today's module because I get to do one of my most favorite things: nerd out about the cloud. I'm a huge fan, but I also know that as a cloud data professional, my enthusiasm level may come on a little strong for folks who don't have a cloud computing background. That's why it's important to really understand the cloud and its many advantages, so that you can explain it clearly to others in a way that is easy to understand and, hopefully, exciting.

Let's first learn about accessibility. One of the big advantages of a cloud computing model is that organizations can access and manage data, software, storage, and cloud infrastructure from any location at any time through the internet. They don't need to be physically present where the hardware and software are installed, and they don't need their cloud service provider's assistance when they need more data.

Next is scalability, which means to easily expand or upgrade computing resources to meet changing needs. Scalability eliminates physical computing limitations.

Now, the benefit of cost savings is pretty straightforward. Organizations only pay for the computing resources used. In a cloud computing model, organizations get what's called a measured service. Similar to household utilities like electricity and water, users are charged only for what they use, based on the number of transactions, the storage volume, and the amount of data transferred. This helps make all kinds of business initiatives more profitable and sustainable.

The advantage of security is also pretty straightforward. With cloud computing, an organization's systems, data, and computing resources are protected from theft, damage, loss, and unauthorized use. Cloud computing security is generally recognized as stronger than the security of a traditional network infrastructure. This is because data is located in data centers that few people have access to. Plus, the information stored on cloud servers is encrypted, meaning it's not something that
can be easily broken into.

Okay, moving on to efficiency. There's a lot that's efficient about cloud computing, but one of the main advantages is that organizations can provide immediate access to new and upgraded applications. There's no time wasted worrying about the state of network infrastructure or going through a costly or time-consuming implementation process.

There are tons of amazing things about the cloud, and now we've come to freeing resources so users can focus on more value-added tasks. In the cloud field, we refer to this as managed services. A managed service involves a third-party provider taking care of the ongoing maintenance, management, and support of an organization's cloud infrastructure and applications. This, in turn, gives users lots more time to focus on other work. It's like a mechanic who automatically comes to you for annual inspections and services, rather than you spending many hours in a mechanic's shop waiting for services. That's because all of the ongoing maintenance and management from the cloud happens automatically in the background. A user doesn't have to initiate it.

Because cloud computing is super versatile, it offers a wide range of common uses, including disaster recovery, data storage, and large-scale data analysis, that provide users with significant benefits. Let's start with disaster recovery. Using cloud computing in disaster recovery means having access to more data centers to ensure that data and information are safe and secure during an emergency. The next benefit is data storage. Data storage helps streamline data centers by storing large volumes of data, which enables easier access to the data, analysis of the data, and backup of the data. Then we have large-scale data analysis. Large-scale analysis offers easy and quick access to multiple data sources and intuitive user interfaces to query and explore the data. This speeds up the overall process of discovering data-driven insights.

Isn't cloud computing amazing? Users can say goodbye to the
limitations of traditional data storage and computing methods and enjoy the world of advantages that it offers. And data analysts can help users seize these advantages with expert cloud data analysis skills.

When cloud computing was first introduced, many people resisted the idea of losing physical control over important files, cherished photos, and all sorts of other data. People were used to keeping these things close by, under their own roofs. So let's use those cherished photos as an example. Putting a photo in an album that you keep on your bookshelf does offer control, convenience, and security, to a certain extent. Control can be limited by resources: you need money, materials, and time to print out a photo or purchase a frame or album. You can also only fit so many physical items in your space. As far as security goes, well, that physical memento could be damaged. Now let's think about how we can enjoy that photo if it's in the cloud. You can view and share it anytime, anywhere. And if you still want a physical copy, you can make one and feel comfortable knowing that you have a backup, just in case.

Choosing between traditional and cloud computing is also a trade-off. Both have advantages and limitations, and both can have a place in business, depending on what the priorities are. In this video, you'll learn about traditional computing, how it works, and its defining characteristics. You'll then compare traditional and cloud computing, which will be valuable to know in the role of a cloud data analyst.

So what's traditional computing? Traditional computing is a computing model that enables data storage, access, and management through the use of physical hardware and software within a network infrastructure typically located on premises. Here's how that all comes together. First, hardware is set up in a dedicated space or room by IT professionals. Next, the required software, operating systems, applications, and security tools are purchased and installed. Once the hardware and software are operational, IT
personnel are responsible for maintaining and managing the entire system. This infrastructure gives an organization sole control of and access to its data and equipment. So with traditional computing, everything you need is located in one location, on premises, and can't be accessed anywhere else.

This defining characteristic offers four key advantages: control, security, compliance, and no reliance on the internet. Let's explore each of these. First, with traditional computing, organizations have full control over their hardware, software, and data. They can customize their localized network infrastructure to meet their specific needs. And because of this control, users usually feel more confident about the second advantage, security, if properly maintained. This is because they have sole access to their systems and sensitive information. Third, traditional computing might be the only viable option if a business is in an industry that requires data to be stored on premises. This is an example of compliance, which means that a company must follow certain regulations, rules, and laws, in this case, ones that deal with data security. Lastly, traditional computing does not rely on an internet connection when users want to access the network or the data it contains, so important information can be accessed even if internet service goes down.

But just as with our photo album example, there are some downsides. First, with a traditional computing system, data access is limited to the device and location where the hardware and software are installed. Also, scaling up in a traditional computing model is challenging. Software limitations, the time needed to purchase and set up hardware, and the physical space required make it difficult and expensive to scale. Besides scaling-up expenses, traditional computing involves buying hardware and software, plus ongoing maintenance of network infrastructure. Lastly, traditional computing can be inefficient, as each user's software must be purchased rather than shared. And again, this
software is not automatically updated. These are just some of the reasons why many organizations are moving to the cloud for their computing needs. The cloud is more accessible, scalable, and offers tons of savings. It's also super secure, efficient, and frees up staff to work on more projects. It's picture perfect. Get it?

Thanks for joining me as we venture into the wide world of cloud data warehousing. There's so much data out there, it's truly dizzying, so it's no surprise that businesses have struggled to figure out where to keep it all. The fact is, traditional databases struggled to keep up with the evolving demands of data analytics. Luckily, cloud data warehouses are emerging to fill the need. How do they do it? Well, that's what we're going to learn about in this video.

First, a cloud data warehouse is a large-scale data storage solution hosted on remote servers by a cloud service provider. To understand this better, picture it like a huge warehouse where large amounts of different types of containers from various places are stored. The cloud data warehouse can collect, store, integrate, and analyze data.

There are many advantages to this structure. Cloud data warehouses are typically fully managed by the cloud provider. This means that the cloud provider takes care of various operational tasks and maintenance, allowing users to focus on utilizing the data and gathering insights rather than handling the underlying infrastructure. This saves time, money, and resources. Cloud data warehouses also have more uptime compared to on-premises data warehouses. Uptime is the amount of time a machine is operational, and of course, only working computers have the ability to scale and support increased demands for data. Next, cloud warehouses can integrate separated data by gathering data from various structured sources within an organization, like sales systems, email lists, and websites, and pulling it all into one place. This integrated data can then be analyzed for some pretty exciting and useful business
insights.

Another big advantage is that cloud data warehouses provide real-time analytics, ensuring users have quick access to the latest information. And in business, being fast is usually the key to outperforming the competition. Cloud data warehouses also offer some really cool artificial intelligence, or AI, and machine learning, or ML, capabilities. And when you apply AI and ML to your data analysis, this really powers up the possibilities. My team worked on a recent project where we built a predictive model to help Google anticipate demand for office amenities, such as its cafes, and help save money and reduce waste. Using ML tools, we were able to test over 30 factors across months and months of data and build a model that could forecast demand with enough accuracy and time to allow on-the-ground teams to adjust accordingly. Pretty cool, right? Last but not least, cloud data warehouses enable custom reporting and analysis. This means that users can analyze and generate reports specifically from historical data, because it is stored on a separate server from data related to current business transactions and day-to-day operations.

As you've probably figured out, the types and amounts of data that companies need to organize are only growing, which means so is the demand for data storage. Luckily, cloud data warehouses are up for the challenge, with the added benefits of management and analysis to make it even easier to use the data you have.

Okay, data enthusiasts, now that we know what cloud-based data warehouses are, we should probably figure out which one suits our needs. And I've got a great one to introduce to you: meet BigQuery, Google's powerhouse of storage and analysis. An organization's data is vital to its business success, and data warehousing helps make the most of that data by providing quick and easy access to information, which leads to ideas, insights, and, best of all, data-driven decision making. BigQuery is a data warehouse on Google Cloud that helps users store and analyze data
right within BigQuery. They can query data, filter large data sets, aggregate results, and perform some really complex operations. BigQuery works with SQL, or Structured Query Language. This is a computer programming language used to communicate with a database. It allows users to search through massive amounts of data and find the information they are searching for incredibly quickly, using Google infrastructure. As a cloud data analyst, BigQuery's integrated SQL interface and machine learning capabilities will help you discover, implement, and manage data tools to inform critical business decisions. The output of your work in BigQuery can integrate with typical business intelligence tools or spreadsheets, but there's a lot more to explore.

Another feature is BigQuery's ability to easily migrate existing data warehouses from other cloud service providers. This is a huge timesaver. One of my favorite things about BigQuery is its dry run parameter. This lets you check your query thought process and plan before actually running it, and BigQuery will tell you the number of bytes the query will process, so you can estimate the cost before actually querying the database. It's like a practice swing in golf to help you make sure your ball goes in the hole. You can also use BigQuery to store, explore, and run queries on data gathered from servers, sensors, and other devices. And scheduled queries can be used to automatically refresh data and keep tables up to date. Data can be updated hourly, daily, or weekly, so you'll deliver the most dynamic, timely metrics to your stakeholders.

On my team, we use BigQuery almost daily to query, transform, and report on data. Using BigQuery, we're able to tap into a multitude of data sources, which allows us to support our users with the most relevant insights. We use SQL to join data sets and transform the data, creating tables and charts that provide answers. And when we land on an answer that can be useful in the future, we scale it, building self-service reports and
For my team, BigQuery helps us create a bridge between the data that exists and the problems folks are trying to solve or questions they're trying to answer. With its smooth integration with other tools, user-friendly interface, and the use of SQL for effective programming, BigQuery makes the discovery of valuable information within complex data sets simple and productive. It's an essential part of the cloud data analyst toolbox. There's tons to explore, so have some big fun getting to know BigQuery. This will be an invaluable tool for your cloud career. It's been a blast introducing you to the field of cloud data analytics. You've learned that cloud computing is an advanced and powerful computing model that resolves many limitations of traditional computing. It also addresses the evolving data computing needs of people and businesses all across the globe. Cloud computing provides on-demand availability of computing resources as services over the internet, which offers accessibility, scalability, cost savings, security, and efficiency. And it frees up your time and resources so you and your colleagues can focus on the kinds of tasks that bring more value to your team and organization. You began this course with an introduction to cloud computing. You then learned about its history, current defining characteristics, and the advantages of using a cloud computing model. Next, you examined the differences between cloud computing and traditional computing, like a physical network infrastructure in traditional computing versus a cloud network infrastructure in cloud computing. You've got this, and remember to celebrate your hard work in a favorite way: a yummy snack, a comfort show, or a touchdown celebration. Hi there, welcome to the next section, where you'll continue learning about data analytics in the cloud. In this section, you'll explore migrating data to the cloud from on-premises systems, and
together we'll get into the differences between on-premises, hybrid, and cloud data system architectures. You'll also learn a lot about the Google Cloud Architecture Framework. Throughout, you'll witness the cloud's impact on data analytics and many other industries, and you'll check out strategies for cloud cost optimization and its benefits for users. You'll also explore the cost of storage, running queries, and resource provisioning, along with the different billing models. This information will help guide your future employer toward the most cost-effective cloud solution for their particular needs. Meet you in the next video. Anyone who's been through a move knows that it's a lot of work. There's emptying shelves, sorting items for packing or donation, carefully boxing everything up, and loading the boxes into a moving van, and then you have to unpack and get everything organized in its proper place once you get to the new place. An organization migrating its physical computing infrastructure to the cloud also requires careful planning and a great amount of effort to ensure a successful move. Fortunately, third-party cloud service providers like Google Cloud can help make everything easier. In this video, we'll discuss the process of migrating an on-premises computing network infrastructure to a cloud platform. You'll learn the steps to follow, cloud data migration strategies, and important factors to consider during migration. All right, the first step is to think about some key factors. These include choosing the right cloud environment for your organization. Then think about how much data will be transferred to the cloud. This is important because large amounts of data can take a long time to move, which can delay operations. Next, consider how much downtime your organization can deal with during migration. Obviously, no business wants to shut down their systems any longer than necessary, so this decision should be agreed on by all stakeholders. The next step is to choose your migration strategy.
Options include rehosting, also called lift and shift; replatforming; repurchasing; refactoring; and retiring. Let's break those down. Rehosting is a cloud migration strategy that involves moving an entire on-premises system to the cloud without changing anything else about the system. An exact copy of the current setup is created in the cloud, which helps the organization quickly achieve a return on investment as they use the enhanced efficiency of their operations, the robust and reliable nature of the cloud infrastructure, and the innovative technologies that are built into cloud-based solutions. Replatforming is a cloud migration strategy that involves making small changes to the on-premises system once it's migrated to the cloud. So the main structure of the system's applications remains the same, but a few things are improved for better performance. Repurchasing is a cloud migration strategy that involves moving applications to a new cloud-based service platform, usually a software-as-a-service platform. The cloud service will be an all-new experience, so this requires some team member training. Refactoring is a cloud migration strategy that involves building all-new applications from scratch and discarding old applications. This is ideal when organizations need new features, like serverless computing, that their current systems don't have. Retiring occurs when applications that are no longer useful are turned off. The next step in the migration process is choosing your migration partner. As a cloud professional, you'll want to help your organization find a cloud service provider that offers the right infrastructure for your particular business, that offers valuable services and tools, and that invests in development to keep things fresh. It's also important to examine the provider's customer support and service level agreement, or SLA, so you get reliable and prompt support. Obviously, I'm a big fan of what we do here at Google, especially how we help our partners prepare for cloud migration success.
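As a rough illustration of how these five strategies map to a workload's situation, here's a toy decision helper. The yes/no flags and their priority order are assumptions made for this sketch, not a formal methodology:

```python
def suggest_strategy(still_needed=True, needs_new_features=False,
                     saas_available=False, minor_tweaks_help=False):
    """Pick a cloud migration strategy from a few yes/no questions."""
    if not still_needed:
        return "retire"       # turn off apps that are no longer useful
    if needs_new_features:
        return "refactor"     # rebuild from scratch, e.g. for serverless
    if saas_available:
        return "repurchase"   # move to a SaaS platform
    if minor_tweaks_help:
        return "replatform"   # small improvements after migration
    return "rehost"           # lift and shift the system as-is

print(suggest_strategy())                         # → rehost
print(suggest_strategy(needs_new_features=True))  # → refactor
print(suggest_strategy(still_needed=False))       # → retire
```

Real migration planning weighs many more factors, but the shape of the decision (asking what each workload needs before choosing a strategy) is the same.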
We've got something called the Google Cloud Adoption Framework, which helps users assess their organization's readiness to adopt cloud technologies. This framework acts like a map from current capabilities to their ideal cloud destination. The Google Cloud Adoption Framework evaluates four themes. First, learn refers to the quality and scale of an organization's learning programs. Lead describes the level of support from leadership given to an organization's IT department when migrating to Google Cloud. Scale is the extent to which an organization uses cloud-based services and how much automation they need to manage their system. And secure ensures an organization's ability to protect their cloud environment from unauthorized and inappropriate access. Google Cloud provides a migration path that also has four phases: assess, plan, deploy, and optimize. In the assess phase, users perform a thorough review of their existing network infrastructure. Then, in the plan phase, users set up a basic cloud infrastructure on Google Cloud where their workloads will exist. When it's time to deploy, the workloads are actually moved to Google Cloud. Lastly, the optimize phase is when organizations begin using cloud-based technologies and features, and this is where they start to really enjoy the improved accessibility, scalability, cost savings, security, and efficiency. Like any move, cloud migration requires careful planning. As a cloud data analytics professional, the ability to help your organization follow the steps in this video will help ensure everything gets to where it needs to be, so you can enjoy your beautiful new place in the cloud. Welcome to this intro to cloud deployment models. We're going to explore how to advise any organization in choosing the right environment for their unique business needs. There are three primary models: public clouds, private clouds, and hybrid clouds. As a cloud data analytics professional, it will be important to understand how each works. Then you can help your
organization select a model that's flexible, adaptable, and helps users quickly respond and adjust to changing conditions. First up, a public cloud is a cloud model that delivers computing, storage, and network resources through the internet. In this model, these resources are shared among multiple users and organizations, granting them on-demand access and utilization. Public cloud services are overseen and maintained by third-party cloud service providers, who not only manage the infrastructure but also operate their own data centers. Next, a private cloud is a cloud model that dedicates all cloud resources to a single user or organization and is created, managed, and owned within on-premises data centers. Finally, hybrid clouds are a combination of the public and private models. They enable organizations to enjoy both cloud services and the control features of on-premises cloud models. Think of these three cloud models as different ways to heat a building. Public clouds are like an electricity company that delivers the power you use to generate heat. You can choose to turn it up when you're chilly or turn it off when it's warm outside. You only pay for the power used, and as a customer, you don't need to worry about the maintenance of the power lines and generators. Private clouds are like having your own solar panels to generate power and heat. You need to buy the panels, have a place to install them, and you're responsible for their proper care and maintenance. And hybrid clouds are like using the services of an electricity company but also owning and using solar panels. This gives you more options over how heat is delivered. When the temperatures drop, you can choose when the power company is right to use and when solar panels are the better option. Okay, now let's discuss the advantages and disadvantages of each model. With public clouds, you pick and choose the resources you need and pay only for what you use. Public clouds can easily scale up or down based on demand, and there are no maintenance
worries, because the cloud service provider handles all of that for you. Another key advantage is reliability. Public clouds have vast networks of servers and can quickly redirect resources in an emergency. Speed and ease of deployment are also benefits of a public cloud model. Adoption occurs faster and more simply because the cloud infrastructure is already in place. Lastly, public clouds offer new services and frequent updates that enable users to benefit from the latest innovations, like artificial intelligence and machine learning, or AI and ML. Now, private clouds come with higher maintenance and management costs, but offer a few critical advantages. The first is that, as the name suggests, they offer private and secure networks if protected properly; without proper security measures put in place, a private cloud can be vulnerable to hackers. Second, private clouds help with any required regulations and compliance, because you control where your data is stored and where computing takes place. Private clouds also provide consistent performance, because hardware isn't shared among other organizations. Going hybrid can be a bit tricky, as blending public and private models adds complexity, but there are some key benefits. A hybrid cloud model allows you to add a public cloud provider to your existing on-premises infrastructure, which increases cloud computing power without adding data center expenses. A hybrid cloud model also gives you access to the latest innovations, like AI and ML, which can really be business game changers. Because you choose where your applications sit and where computing happens, there are key security and compliance advantages. Likewise, hybrid computing occurs closer to the actual users, so it enables faster performance. There's also greater flexibility, because you can choose whichever cloud environment is best for the specific task at hand. Whether your organization is best suited to a public, private, or hybrid cloud model, your guidance will help choose a great option. All three offer
some really cool ways to advance any organization's computing power. Keeping track of business data used to be incredibly labor intensive. From handwritten ledgers to complex filing systems to typing facts and figures into a spreadsheet, collecting, cleaning, organizing, and storing information was a huge and resource-heavy task. But cloud data analytics has automated and enhanced these processes, making data management much more efficient and less prone to human error. In this video, we'll dive into how the cloud enables data from a variety of sources to be smoothly integrated, creating a single source that users can access and analyze in real time. First, cloud data can be managed with data integration or data ingestion. Data integration combines data from different sources into a single usable data source. This integration can happen through the ETL process, or extract, transform, and load, or the ELT process, extract, load, and transform. ETL and ELT are cloud-based approaches that use the power of cloud data warehouses like Google BigQuery to transform data. ETL transforms the data before it's loaded into the warehouse, and ELT transforms it after, but either way, it's ready for further processing or analysis. Data ingestion obtains, imports, and processes data for later use or storage. The information is obtained from various sources and processed through stream or batch ingestion. Stream ingestion involves real-time, continuous processing of data as soon as it is collected from various sources. Batch ingestion processes data in predefined intervals or larger chunks. Those are just a few of the ways cloud data analytics has transformed how organizations access their data. There are also web interfaces, application programming interfaces or APIs, SQL, other ingestion tools like Pub/Sub, and business intelligence solutions like Looker and Jupyter notebooks. All of these help users access data that's stored in the cloud, anywhere, anytime.
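The ETL flow described above can be sketched in a few lines. In this toy version, the "warehouse" is just a Python list, and the transform step cleans and reshapes each record before loading; all names and values are invented for illustration (ELT would simply load first and transform inside the warehouse):

```python
# Extract: raw records pulled from some source system.
raw = [
    {"name": " Ada ", "amount": "120.50"},
    {"name": "Grace", "amount": "80.00"},
]

def transform(record):
    """Clean a record before loading: trim text, cast numbers."""
    return {"name": record["name"].strip(),
            "amount": float(record["amount"])}

# Transform, then Load into the "warehouse".
warehouse = []
for record in raw:
    warehouse.append(transform(record))

print(warehouse)
# → [{'name': 'Ada', 'amount': 120.5}, {'name': 'Grace', 'amount': 80.0}]
```

The real versions of extract, transform, and load each talk to cloud services, but the three-step shape is exactly this.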
And while we're discussing data in the cloud: cloud data analytics also makes it possible to store different types of data, like files, objects, or blocks. File data is information that's stored in a file on your computer or another storage device. Object data is a piece of information with a unique identifier, which you can find no matter where it's stored. And block data is a piece of information that has been cut from a larger piece of information and given its own file path. So many data analysis activities have greatly benefited from cloud data analytics processes: big data analysis, the ability to visualize multiple data sources asynchronously, AI and ML, custom report analysis, data mining, data science, and the list goes on and on. In today's world, innovation is the driving force for many companies, and there's no doubt that data fuels these innovations. It's really amazing how much cloud analytics has advanced the field of data analytics, making powerful analytical tools and processes available to organizations of all kinds. At the same time, it enhances the analytics process, making it easier, faster, and more cost effective for users to discover valuable insights from data. Hello, future cloud pro. Thanks for being with me for this rundown on the key features affecting cloud costs: resource provisioning, storage, and running queries. Let's start with an example. Say you're headed to the market, so you create a shopping list. You think about what you'll need in the coming week, then write down exactly those items. The list helps ensure you don't overspend on items you don't need or that might go bad later. Well, managing the cost of cloud data analytics is kind of like that. The key is to be a super savvy shopper, knowing exactly what resources you need and how much. This saves money and prevents waste. The first method cloud professionals use to achieve these goals is resource provisioning. This is the process of a user selecting appropriate software and hardware resources, and the cloud service provider setting them up and managing them
while in use. The resource provisioning process occurs through one of three delivery models: advanced provisioning, dynamic provisioning, and self-provisioning. Each delivery model is different based on the types of resources an organization buys, how and when it receives these resources, and how it pays for them. In advanced provisioning, the user signs a formal agreement with the cloud service provider and either pays a set price or is billed monthly. Then the provider gathers the agreed-upon resources and delivers them to the user. In dynamic provisioning, resources are adjusted based on the user's changing needs, and they're only charged for what they use. This means they can easily scale up or down based on usage demands. With self-provisioning, also known as cloud self-service, the user purchases resources from the cloud provider through a website or online portal, and then the resources are quickly made available for the user, usually within hours or even minutes. Although users only pay for the resources they use, how they choose to receive these resources affects how much they pay. Payment can be arranged with one of three rates: fixed, pay as you go, or instant purchase. Now let's move to storage costs. Storage is ranked as one of the top three cloud expenses for many organizations, and the demand for more storage capacity only continues to grow. Storage costs vary based on data storage, data processing, and network use. As the term suggests, data storage is the amount of data kept in storage. In the cloud profession, we refer to these as buckets. Exactly how much an organization pays for the storage can actually change based on where those buckets are located in the world and the type, or class, of the data being stored. For example, coldline storage is great for data you read or change once a quarter, while archive storage keeps data that is only meant for backup or archiving purposes, and it's the cheapest form of data storage.
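Storage class choice turns directly into dollars, since each class trades a lower per-GB rate for less frequent access. Here's a toy monthly-cost estimator; the per-GB rates below are assumptions for the sketch (real Cloud Storage pricing varies by region and changes over time):

```python
# Assumed, illustrative per-GB monthly rates; archive is cheapest.
RATES_USD_PER_GB = {
    "standard": 0.020,
    "nearline": 0.010,
    "coldline": 0.004,
    "archive":  0.0012,
}

def monthly_storage_cost(gb: float, storage_class: str) -> float:
    """Estimate a bucket's monthly storage cost for a given class."""
    return gb * RATES_USD_PER_GB[storage_class]

# Compare 1 TB (1,000 GB) of data across classes.
for cls in RATES_USD_PER_GB:
    print(cls, round(monthly_storage_cost(1000, cls), 2))
```

The spread between standard and archive is more than an order of magnitude, which is why matching access patterns to storage classes matters so much for the storage bill.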
Now, data processing is the step where raw data is cleaned, organized, and changed into a format for easy analysis. Data processing can significantly affect storage costs in several ways. For example, the more data you process, the more storage space you'll need, and faster data processing requires more advanced storage solutions. Finally, network use refers to the amount of data that is read or moved between storage buckets. Okay, let's dive into the costs associated with running queries on stored data. One interesting point to understand is that most cloud service providers charge for query processes based on the amount of data processed, not the amount returned after the query is run. So every time a cloud data analyst runs a query to retrieve, update, or analyze data, a bill is generated based on the total amount of data that query processes. With Google BigQuery, for example, there's a choice of two pricing models for running queries: on-demand pricing and capacity pricing. On-demand pricing is based on the amount of data, in bytes, that each query processes. The on-demand pricing model is ideal for users whose compute and storage needs fluctuate based on business priorities. And capacity pricing is based on the computing power used to run queries over time, measured in virtual central processing units called slots. The capacity pricing model is ideal for users who want a more predictable, controllable cost for their queries. As a cloud data analytics professional, you can bring a ton of value to your employer by using this knowledge to get the best performance and value from cost-effective cloud resources and services. Best of all, you'll be helping them optimize resources, processes, and the bottom line. It wasn't too long ago that patient charts were kept on premises at a doctor's office. This created a limitation to information access. Important health information couldn't be easily shared with other medical offices or hospitals. A patient might not know if they were up to date on vaccinations or be able to access their medical history.
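Choosing between the two pricing models is ultimately an arithmetic exercise: compare what a month of queries would cost at a per-TiB rate against a flat slot commitment. All the rates and workload numbers below are assumptions for the sketch, not published prices:

```python
def on_demand_monthly(tib_scanned_per_month, usd_per_tib=6.25):
    """On-demand: pay per TiB of data your queries process."""
    return tib_scanned_per_month * usd_per_tib

def capacity_monthly(slots, usd_per_slot_hour=0.06, hours=730):
    """Capacity: pay for reserved slots (virtual CPUs) over time."""
    return slots * usd_per_slot_hour * hours

scans = 400  # assumed TiB scanned per month
print(on_demand_monthly(scans))            # → 2500.0
print(round(capacity_monthly(100), 2))     # → 4380.0
# At this workload, on-demand wins; heavier scanning tips the balance.
```

The crossover point (where a slot commitment becomes cheaper than paying per scan) is exactly the calculation an analyst would run before recommending a pricing model.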
Similarly, people working on a manufacturing shop floor constantly had to count inventory parts one by one, entering the totals into a physical spreadsheet. That's how they knew they had what they needed to get products manufactured for their customers. But today, cloud data analytics is revolutionizing healthcare and manufacturing. In this video, we'll focus on these industries, and we'll also dive into the fields of education and transportation. You'll discover the cloud's far-reaching impact and learn how you, as a cloud professional, can help all kinds of businesses predict the next big trend, discover patterns that lead to innovation, and make quick decisions that improve operations, systems, and customer satisfaction. Some fascinating new personalized medicine and predictive analytics opportunities are significantly improving patient outcomes, helping people live healthier lives. Cloud data analytics in healthcare lets us ask big questions about many patients' data. We can learn how a medical product works over time and with different people, or see trends in how often certain medical products are prescribed. And this is just a snippet of the transformative advances that cloud data analytics has had within the healthcare industry. Now let's check out cloud data analytics in manufacturing. Today, the manufacturing industry faces challenges like unpredictable demand fluctuations and supply chain disruptions. The industry is also embracing innovative methods of operating and improving overall efficiency for manufacturers, clients, and customers. The capacity for manufacturing companies to quickly adapt and respond to these conditions is more critical than ever. Integrating cloud data analytics into manufacturing processes is becoming an essential part of the industry. The real-time analysis of massive amounts of data retrieved from manufacturing processes and client and customer interactions helps companies optimize their operations for reliability and efficiency. Analysis of cloud data also offers
organizations the ability to build more targeted and innovative products. Cloud data analytics in manufacturing helps keep factory and warehouse operations running smoothly. For example, smart technologies identify issues or the need for repairs and flag things that don't meet product standards. AI checks quality at every step of production to keep standards high. Cloud data analytics also improves the customer experience by making sure manufacturers produce exactly what their customers want. Plus, it makes the supply chain transparent: you know what's happening with your customers, partners, suppliers, and even their suppliers. Again, smart technologies can prevent and help track supply chain disruptions, and it helps create products based on customer-driven data, improving efficiency and increasing sales. Okay, let's review how cloud data analytics impacts the education field. Technological advancements, including cloud data analytics, enable educational institutions to better equip learners with the knowledge and practical skills necessary for career readiness and typical workplace scenarios. For example, consider a university that wants to optimize its course offerings based on student preferences and performance data. By leveraging cloud data analytics, the institution can analyze enrollment patterns, student feedback, and academic performance metrics. Cloud data analytics in education enables a more thorough evaluation of the validity and effectiveness of courses and course content. It also helps educators design learning experiences for each student based on prior interactions with content. Next up, an example from the transportation industry. Transportation companies typically cover large geographical areas and make frequent trips to and from major destinations. This not only requires a large number of vehicles, such as planes and ships, but also careful and strategic planning. By using real-time cloud data analysis for effective decision-making, more efficient routes can be designed that save
time, money, and resources. The cloud also helps these businesses predict possible delays, optimize operations, increase reliability, and improve customer service. Cloud data analytics in transportation can improve transport vehicle maintenance. Just as in manufacturing, technologies installed in vehicles monitor their operational status and provide maintenance alerts. Smart technologies also enable AI and ML integration. AI and ML capabilities are used by transportation companies to analyze historical travel data and discover valuable insights that are then used to predict future travel trends. And finally, cloud data enables logistics planning. With cloud data analytics, transportation companies can make the most of the latest virtual modeling software to help plan intelligent logistics and efficient routes. Although we've only highlighted four industries in this video, cloud data analytics has had a profound impact on almost any field you can think of, and its impact will only continue to grow, which is why demand for cloud professionals is so high, no matter which industry you choose to pursue in your cloud career. Hi there. In this video, we're going to dive into cloud architecture, and to do that, we're heading to the shopping mall. Okay, so picture a huge shopping mall with all kinds of stores, each selling different products and services. Whatever you need, you'll likely find it. The best part? As a shopper, you have access to all of these cool things without having to worry about the building's utilities, maintenance, or security. The mall handles all of that. Well, cloud architecture is like a shopping mall. Thinking about this in business: an organization may need to store files? That's no problem. Need a lot of computing power? Easy as can be. Want to analyze data? Right this way. And behind-the-scenes activities are up to the cloud provider. The organization chooses the services they need, and the cloud provider puts the services to work. Cloud architecture is what builds the cloud, and
how the various components and technologies are arranged is what lets organizations pull, share, and scale resources over a virtual network. The components that make up the structure include a front-end platform, a back-end platform, a cloud-based delivery model, and a network. Front-end platforms include the parts of the architecture that users interact with: the screen design, apps, and home or work internet networks. Front-end platforms allow users to connect and use cloud computing services. For example, say a user opens the web browser on their mobile phone and edits a Google Doc. In this case, the browser, phone, and Google Docs app are all front-end cloud architecture components. In contrast, back-end platforms are the components that make up the cloud itself, including computing resources, storage, security mechanisms, and management. A primary back-end component is the application. This behind-the-scenes software is accessed by a user from the front end, and then the back-end component completes the task. When you use an online shopping app, you may browse products first, add them to your cart, and then make a purchase. All these actions are possible because the back-end application is at work behind the scenes to complete the shopping experience. Another key back-end component is service. You can think of service as the heart of a cloud architecture, because it takes care of all the tasks happening in the computing system and manages the resources users access. The third primary back-end component is runtime cloud. Runtime cloud provides a space for all cloud services to perform efficiently. It's similar to your laptop or phone's operating system and ensures cloud services run smoothly. Runtime cloud technology creates and monitors all services, like applications, servers where data is stored, space for storing files, and network connections. Next, we have storage. This is where cloud service providers keep the data to run systems. Different cloud service providers offer flexible and scalable
storage designed to hold and organize vast amounts of data in the cloud. Infrastructure is probably the most well-known component of cloud architecture. It's made up of important hardware that allows the cloud to work and keeps systems at their best, including the central processing unit, or CPU, the graphics processing unit, or GPU, and the network devices. The final primary back-end component is security. Because more and more organizations are adopting cloud computing, it's essential to ensure that everything is kept safe. Planning and designing security for data and networks provides critical supervision of systems, stops data loss, and avoids downtime. Cloud architecture is the blueprint of how all these components can deliver a highly agile and scalable solution. Understanding cloud architecture is an important step toward becoming a skilled cloud professional, and you're well on your way. Hi, it's great to be with you, learning about cost optimization in the cloud. Optimizing costs, in other words, saving money, is something that's important to all businesses, whether it's lowering the price of cloud computing or just planning and making smart decisions. As an everyday example, say you were heading out of town for a week. You wouldn't keep all your lights on, leave perishable food in the fridge, or schedule important deliveries. That would be a total waste of money. Cloud cost optimization is a lot like that. It involves planning and managing resources so you're not paying for what you don't need or spending more on services than necessary. In this video, we'll define cloud cost optimization as it relates to using and managing cloud resources, discuss the benefits of cloud optimization, and dive into some cloud cost optimization strategies. Let's get started. So first, what exactly is cloud cost optimization? Cloud cost optimization is the process of reducing cloud expenses by implementing cost reduction strategies. But cloud cost optimization does more than just save money for organizations.
It also offers ways to enhance workload efficiency and ensures cloud resources are giving users maximum value. It does this in a few key ways. The first is cost visibility. This means the organization knows exactly what they spend their money on and how specific cloud services are billed. It's all about being able to justify why a certain amount of money was spent to achieve an operational goal. The next benefit is improved performance of applications. By optimizing cloud resources, organizations can guarantee that apps run smoothly, which improves the user experience while reducing cloud expenses. Another really critical advantage is cutting carbon emissions. Like most business activities, using the cloud affects the environment, so as a cloud specialist, you can play an important role in reducing your organization's carbon footprint by optimizing cloud resources. Okay, now that we've examined some cost optimization benefits, let's discuss some strategies that businesses can use to ensure they're getting the best return on their cloud investments. First, right-sizing. This is the process of adjusting computing resources, like processing power and storage, to fit the exact needs of an application or workload. This optimizes usage. Similarly, autoscaling is a cloud service that monitors applications, automatically scaling up or down according to the computing resources needed to meet user demand. Finally, reserved instances is a cloud payment model in which an organization purchases a specific amount of resources for a certain time and receives a discount in return for this commitment. Organizations can use cloud resources long term when they do not expect to have to scale up or down during a specific time period; then the organization can benefit from reserved instances. When implemented, these cost optimization strategies significantly reduce expenses, improve app efficiency, cut waste, increase environmental responsibility, and provide a better return on cloud investments.
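Autoscaling logic is, at heart, a thresholded feedback rule: add instances when utilization runs hot, remove them when it runs cold, and stay put in between. A minimal sketch of one scaling decision, with the thresholds and bounds chosen arbitrarily for illustration:

```python
def autoscale(instances, cpu_utilization, low=0.30, high=0.75,
              min_instances=1, max_instances=10):
    """Return the new instance count for one scaling decision."""
    if cpu_utilization > high and instances < max_instances:
        return instances + 1   # scale up to meet demand
    if cpu_utilization < low and instances > min_instances:
        return instances - 1   # scale down to stop paying for idle capacity
    return instances           # utilization is in the comfortable band

print(autoscale(3, 0.90))  # → 4
print(autoscale(3, 0.10))  # → 2
print(autoscale(3, 0.50))  # → 3
```

Real autoscalers add smoothing, cooldown periods, and multiple metrics, but the cost-saving idea is the same: capacity follows demand instead of sitting idle.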
Plus, all of these advantages have the potential to create something any business will enjoy: cost savings. Now that's a great way to demonstrate the value of talented cloud data analytics professionals. You've learned so much already and are making great strides toward your future career. You've discovered the fascinating process of migrating to the cloud and how important it is to choose the right data system environment. You also learned about the cloud's impact on both the data analytics field and other industries. Then you reviewed costs and billing models, exploring cloud cost optimization and its many benefits. You also learned about the cost of storage, running queries, and resource provisioning, and the different billing models. This know-how will help you guide any employer toward valuable savings. Lastly, you got an overview of cloud architecture and considered how it's a blueprint for everything that's possible in the incredible cloud. Catch you in the next section. Hi there, welcome to the next section in your intro to cloud computing and data analytics. These topics are all about the data life cycle, from data entry to data destruction and everything in between. To start, we'll discuss the basics of data management and introduce safety and privacy considerations. Next, we'll explore the data life cycle phases, and you'll experience some really cool examples of data in action. Then we'll check out the data analyst's role in the data life cycle. You'll learn about many of the different data professionals you might meet in your career and how they all fit into a data life cycle. Finally, we'll cover strategies for controlling data life cycles, including automation, retention policies, and holds and versioning in data management. There's a lot coming up, and I'm thrilled to be here to guide you through the next exciting step in your cloud journey. Let's get started. Imagine that you have carefully organized a bookshelf, expecting it to stay as you arranged it. However, the next day, you find that someone has rearranged
the books maybe borrowed the one you were saving or even replaced it with a different book it can be frustrating and confusing when you leave something in one state but soon find out that it's been changed or removed this is especially true for data for example maybe a company collects data about its users some of it's sensitive and a lot of it only applies to certain departments and what if every single employee has access to all of the company's data this definitely causes all kinds of problems some employees might mistakenly alter the data or save multiple new versions this makes it pretty easy for someone else to begin working with the wrong information at the same time other employees might inadvertently decide to delete some data when they're done with their own projects that could be catastrophic this business needs a data management plan data management is a process of establishing and communicating a clear plan for collecting storing and using data a data management plan is created to ensure that procedures for each of these steps are understood by all employees in some organizations the data management plan may also be referred to as data governance there are three main reasons why data management is so important first it ensures that collaboration is seamless all users can access data using documented procedures so the data stays within governance and compliance requirements second data management supports a strong data security program it helps set parameters to protect from data breaches or data loss and third a plan for data management helps with scalability by having a clear procedure in place as a cloud data analytics professional you might create data management plans or you could work within one that's already been established either way some key aspects to consider include access data types storage and archives let's dive into each of those first access a data management plan defines the roles of anyone who will access the data including each
user's level of access this can also include the type of data they have access to it may even be as granular as certain rows or columns along the same lines the plan should also Define the data types the organization is allowed to collect like personally identifiable information or PII for example maybe only the marketing department should have access to user emails to send notifications or only analysts have access to user birth dates to segment reports by age brackets next a data management plan should include the types of storage allowed for example BigQuery projects and Google Drive folders and backup plans for outages and finally archives procedures for archiving or deleting data should be considered and follow business guidelines and any external regulations it's also important to call out any exceptions to archive rules for example litigation data may be exempt from these procedures each of these elements of data management is crucial to protecting information and ensuring user privacy while data management plans vary from organization to organization they should begin by clearly defining the business objectives these inform the rest of the plan objectives help Define what type of data will be collected and which teams can access it they determine how long data should be saved and how it will be used and they help prioritize the entire team's goal to effectively share all relevant data management information now there's some key policies to follow when generating a plan first a retention policy may be created for your organization as a whole or for each individual project retention policies outline how data is archived and deleted they may be driven by compliance regulations legal requirements and General Data Protection Regulation or GDPR guidelines next a data collection policy is an outline that creates rules around how data is collected and what resources may be used in the process the archival policy outlines where and how data is stored once an
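As a rough illustration, the kind of role-based, column-level access described above can be sketched in a few lines of Python. The role names and columns here are hypothetical, and this toy lookup only stands in for a real access-control system:

```python
# Toy sketch of column-level data access (hypothetical roles and columns;
# a real plan would use a managed access-control service, not a dict)
ALLOWED_COLUMNS = {
    "marketing": {"user_id", "email"},       # e.g. to send notifications
    "analyst": {"user_id", "birth_date"},    # e.g. to segment by age bracket
}

def filter_record(record, role):
    """Return only the fields the given role is allowed to see."""
    allowed = ALLOWED_COLUMNS.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}

row = {"user_id": 7, "email": "ana@example.com", "birth_date": "1990-05-01"}
print(filter_record(row, "marketing"))  # {'user_id': 7, 'email': 'ana@example.com'}
```

A role that isn't in the plan at all gets an empty result, which mirrors the idea that access must be explicitly granted rather than assumed.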
analysis project is complete and lastly a deletion policy is an outline of when and how data is permanently destroyed of course if employees don't know about the data management plan and how to access it that's just as bad as not having one at all a business needs to clearly communicate the plan long before there's any data to manage and a business should be sure to educate all users about the plan and the specific permissions and procedures that's all for now stay curious hope to continue exploring the cloud with you again soon anyone who works with data inherits the responsibility to protect private or personal data especially personally identifiable information or PII this is all about data privacy which we'll explore in this video PII and other privacy standards have legal implications so it's important to call out that our discussion in this course should not be considered legal advice so what is data privacy data privacy is preserving a data subject's information anytime a data transaction occurs any organization that collects data knows that earning users' trust is an essential part of working with their data as a cloud data professional one of your primary goals will be to use data honestly transparently and fairly PII is data that is unique to an individual and therefore can be used to distinguish one person from another a user's email address mailing address phone number precise location birthday and full name are all examples of PII and must be safeguarded along those same lines personal health data is PHI or protected health information PHI is health data that can identify an individual like information about patient demographics mental or physical health diagnoses or treatments and payment records related to health care it's also important to know about the General Data Protection Regulation or GDPR this is privacy legislation for organizations in Europe it regulates how they can collect use and store personal data it also has requirements around reporting
to increase accountability and fines for those that fail to comply a pro tip is that even if your employer isn't located in Europe the rules may apply to any business that collects data from European citizens whether complying with GDPR or trying to follow best practices in PII and PHI data security there are several ways to help your organization avoid fraud and identity theft one strategy is identity access management this is a process that gives each employee who interacts with the company's system access to specifically defined programs and data sets as a data professional you may be required to only access the bare minimum of data needed to do your job this is usually referred to as NTK or need to know accessing data just out of curiosity or for a non-valid business reason may be prohibited another proven strategy is having internal data stewards this privacy team is in charge of keeping data access in check frequent audits within your organization are an additional security measure an audit is a formal examination of how users are accessing data in order to ensure safe and appropriate access while identifying and solving any data concerns another way to protect data is through the use of security keys for accessing software accounts and data sets this is an authentication method that uses a physical digital signature or key to verify a user's identity before they access these resources finally encryption is another proven method this is just a process of encoding information encryption protects data by scrambling it into a unique secret code that can only be unlocked with the digital key data may be encrypted on an individual machine and unencrypted when you work with it or it may be encrypted when it's shared or sent from one machine to another having access to data especially data about people is a big responsibility so it's really important for cloud data professionals to know the best practices of data privacy and safety hello and welcome to this
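Production systems rely on vetted cryptographic libraries, but the core idea of encryption, scrambling data with a key that can also unscramble it, can be shown with a toy XOR cipher. This is for illustration only and offers no real security:

```python
def xor_crypt(data: bytes, key: bytes) -> bytes:
    """Toy 'encryption': XOR each byte with a repeating key.
    Applying the same key a second time restores the original data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

scrambled = xor_crypt(b"card ending 1234", b"not-a-real-key")
print(scrambled != b"card ending 1234")         # True: unreadable without the key
print(xor_crypt(scrambled, b"not-a-real-key"))  # b'card ending 1234'
```

The round trip, scramble then unscramble with the same key, mirrors how encrypted data is unlocked for use and re-locked for storage or transfer.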
exploration of the data life cycle understanding how data moves through its entire life will help you as a cloud professional stay organized and keep your work as efficient as possible let's begin with an example maybe your analytics team is assigned a new project it can be tempting for everyone to want to get right to work but organizing project phases and creating a plan before you begin helps you get much better results likewise knowing who's responsible for which task creates a streamlined project any data project will benefit from understanding the data life cycle this is the sequence of stages that data experiences which include plan capture manage analyze archive and Destroy let's dive into each one a bit further as its name suggests the plan stage begins before you start any analysis usually it involves answering a particular business question or meeting an objective it's important that all project outcomes are focused on this key point a typical business question is how can we increase user engagement with our product and an objective might be we will reduce our plastic use by 50% by using only recycled materials in our package design next team members decide what data types to collect which data management processes to follow and who is responsible for each stage during the planning stage the ideal project outcomes and how to measure success are defined next is the capture stage when data is collected from different sources this may include outside resources like publicly available data sets or it may be data that's generated from within the organization this stage is also when gaps are identified in an organization's current data collection and are improved through iteration for example maybe Business Leaders want to know how many users have downloaded their app but when reviewing the existing data collection plan an analyst notes that they can only capture data about how many users have created an account but not always how many app downloads have
occurred the analyst may need to work with the engineering team to figure out if it's possible to get this data logged then it's time to process the captured data this is important because raw data directly from the source is usually not in a usable format for analytics so a data engineer needs to clean the data and transform the data they may also compress the data and encrypt the data for security purposes now we've come to the manage stage the goal of this stage is to ensure proper data maintenance this includes safe and secure data storage keep in mind that the manage stage is an ongoing process throughout the project in other words the raw data collected in the capture stage the transformed data from the process stage and any business Logic the analysts establish all must be managed securely on an ongoing basis next is the analyze stage this is when analysts use data to answer the business question or help meet the business objective generated during planning at this point team members review compiled data to find Trends create visualizations and suggest business recommendations based on Data Insights then in the archive stage the data engineer may store data for later use if needed finally when data is no longer useful to the organization it's destroyed although it might seem like a waste to destroy data this is an important stage to ensure that sensitive data cannot be stolen to support privacy protection guidelines and meet other compliance requirements and regulations for example this may keep data practices in compliance with GDPR the General Data Protection Regulation in the European Union this regulation covers the data that an organization is allowed to hold okay one final point about the data life cycle none of these stages work without data team members you just learned that data analysts primarily work during the analysis stage they collaborate with data architects who design the plan for managing the data data engineers build the infrastructure for
the data and data scientists use the data to create models to understand the data with these talented people in place your organization will have a point person for each stage of the data life cycle and everyone will have a clear idea of where projects and data are headed hello and welcome to this video about effective data reporting being able to share Data Insights with others is an essential part of data analytics after all an amazing insight to the business question is only valuable when the information is shared as a cloud data analyst you'll likely create a variety of reports to share your analysis and provide summaries for decision makers the data life cycle plan capture manage analyze archive and Destroy will help you create engaging data-driven Communications now before we go further it's important to note that the data life cycle and the data analysis process are two distinct Concepts when we think about the data life cycle this is how the data itself moves and changes throughout its existence when we think about the data analysis process this relates to how data analysts interact with the data now that we've covered that let's dive into the data life cycle let's shift to an on-the-job example with the data analyst named Kayla a stakeholder at Kayla's company requests a report of all payments the company received in the last month they would like to analyze several invoices to ensure that they're not missing any payments the stakeholder also wants to see the percentage of payments that were made on time and the payment methods to generate this report Kayla will move through the entire data life cycle the first step is the plan stage so Kayla will review the business questions and objectives then she will determine what Fields need to appear on the report Kayla considers questions like what filters are needed to extract or separate the data and how far back in time should the data go now she moves on to the capture stage here Kayla will implement the
plan she's created and assess the metrics she needs to capture then Kayla will confirm the data is available and then gather the data data can come from a variety of sources in the scenario of the payments report Kayla will need user information and payment details sometimes the data you need doesn't exist if that's the case you may need to work with the product or engineering team to figure out how to fill in the gaps you'll also want to establish a reasonable timeline for the data export let's get back to Kayla's process the manage stage is all about data management in this stage Kayla considers how to store the data and prioritize security Kayla will also determine whether she's working with any PII or personally identifiable information when sensitive data is involved it's critical to take steps to protect people's privacy the report Kayla is generating will likely include user credit card numbers addresses and phone numbers if she determines this information to be necessary for her project then she will ensure it's secure otherwise she will omit the PII at the start and simplify her data security needs Kayla can also ensure that the data is accessed on a need-to-know basis so that only people who need the data can access it in this phase Kayla can also perform quality checks to confirm no required values are null or missing in this payments report project let's say she has multiple physical stores collecting payments in addition to an online website so it's important to check that the same fields are using the same kind of data so everything combines seamlessly during table joins for example the field account number will need to match to the correct user account and both need to match in field types like string values for example C12345 next it's time to analyze the data in this stage Kayla uses the data to solve problems and support business goals she can do this through the collection transformation and organization of data in order to draw conclusions
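Kayla's consistency check, making sure the join key has the same type everywhere before combining tables, could be sketched like this. The field names and values are made up for illustration:

```python
def bad_join_keys(rows, key="account_number"):
    """Return rows where the join key is missing, empty, or not a string."""
    return [row for row in rows
            if not isinstance(row.get(key), str) or not row[key]]

store_payments = [{"account_number": "C12345", "amount": 20.00}]
online_payments = [{"account_number": 12345, "amount": 35.00}]  # int, not string

print(bad_join_keys(store_payments))   # [] -- safe to join
print(bad_join_keys(online_payments))  # the mismatched row is flagged
```

Running a check like this on every source before a table join catches type mismatches and nulls early, instead of letting them silently drop rows from the report.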
make predictions and drive informed decision making one of the most exciting parts of the analyze stage is sharing reports of your insights now Kayla will create visualizations to help others understand the insights she will also go back to the security protocol to ensure that she's not sharing any PII with people who should not have access next in the archive stage Kayla will determine what needs to be saved she may need to save data to share it with additional stakeholders later on or she might save it to compare with other reports finally once Kayla determines she doesn't need the data anymore she moves on into the destroy phase destroying data completes this cycle congratulations you've just experienced the whole data life cycle you've met your business objectives and kept all of your company's data safe and sound have you ever played on a sports team or experienced a live sporting event for example take the sport of basketball in basketball there are two teams each with five players the players on each team work together trying to score points by throwing a ball through a net they're all basketball players but each has an individual position and distinct role to play the same is true for a data team everyone's on the team and each one plays a distinct role when completing a project in this video you'll discover the interesting jobs of people who work with data based on what you learn you may even decide to pursue one of these new career paths instead of cloud data analytics that's great whatever Direction you end up taking knowing all about these different jobs will be invaluable first up data analysts these are truly Dynamic people with a wide skill set they know how to seamlessly switch between roles on a data team acting as a data detective a translator even an artist data analysts blend the science of analytics with The Art of Storytelling they work directly with the data data analysts perform data analysis workflows including importing manipulating
calculating and Reporting on business data when reporting they visualize and communicate results to stakeholders which includes creating presentations and dashboards for sharing data data analysts also perform statistical analyses in working through a data cycle some of the tasks they may complete include analyzing and reviewing databases using scripting languages like Python or SQL to query data creating visualizations and dashboards in programs like Tableau or Looker and presenting findings to stakeholders data analysts also play a role in the management and sharing of data under the direction of a data architect or engineer they work with data Engineers to clean data and ensure that it's focused on answering the business questions generated during the planning stages if all of this sounds exciting to you then put on your detective's hat start translating information and choose your artist's paint palette this just might be the career path for you okay let's meet the data engineer next a data engineer is a professional who transforms data into a useful format for analysis and gives it a reliable infrastructure they design create manage migrate and troubleshoot databases used by applications to store and retrieve data much of their work involves developing the structure the team uses to collect store and analyze data the data engineer is also the one who builds the data pipelines pipelines are a series of processes that transport data from different sources to a destination for storage and Analysis Engineers also test and maintain pipelines once they have been built data Engineers are also tasked with the important job of ensuring that data is accessible to all other members of the team as permitted by their company's access policies like all members of a data team Engineers must be effective communicators they collaborate with team members and other stakeholders to ensure company objectives are met it's an exciting career path that makes a powerful impact on
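In miniature, a pipeline like the ones a data engineer builds is just a chain of extract, transform, and load steps. Everything here, the source records and the list standing in for a warehouse table, is hypothetical:

```python
def extract():
    # Stand-in for pulling raw records from a source system
    return [{"item": " Widget ", "price": "9.99"},
            {"item": "Gadget", "price": "19.50"}]

def transform(rows):
    # Clean whitespace and convert price strings to numbers
    return [{"item": r["item"].strip(), "price": float(r["price"])} for r in rows]

def load(rows, table):
    # Stand-in for writing into a warehouse table
    table.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # cleaned, typed records ready for analysis
```

Real pipelines swap each stand-in for a connector to an actual source or warehouse, but the shape, extract then transform then load, stays the same.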
everyone in the organization next we'll get to know data scientists these data professionals analyze data do statistical analysis and usually build and train machine learning models the data scientist is someone who works primarily in data research identifying business questions collecting data from multiple sources organizing it and finding answers now let's consider how a data scientist and a data engineer are alike and how they differ like a data engineer a data scientist uses coding and scripting skills to clean and summarize data they differ in that an engineer typically builds a data pipeline from start to finish and a data scientist uses the data from the pipeline to draw conclusions data scientists create data analysis workflows along with data analysts and Engineers data scientists import data from various sources clean it and use data to perform calculations they also understand the business applications of data and focus on statistics machine learning and modeling the role of data scientists is another wonderful option for your own data career all right last but definitely not least we have data Architects these professionals collaborate with data analysts data scientists and data Engineers to design the infrastructure of the database the data architect helps their team to plan the overall data life cycle of a solution the data architect will usually produce a data or solution architecture diagram to support the management of a data life cycle in collaboration with the analyst and scientists then they relay it to the data engineer to build the database every business is different but those are some of the most common data team roles however you make your way into the field keeping your career options open will allow you to grow and make your own valuable contributions hi I'm Safa I'm a product Analyst at Alphabet doing work that feels meaningful is what excites me the most about my job one thing that my parents did give me their intentions in
giving me a computer was for me to do well in my studies so that I can research things do my homework better but what they didn't realize was that the computer was a Vortex into online connections and I think just by virtue of spending so much time on the computer I found myself really good at Tech skills when it came time to going to college and deciding Oh shoot what do I want to study now I knew I wanted to do something within tech I think cloud is an exciting career space because if we're talking especially around data analytics and we're thinking about where we can host data on the cloud I mean there's going to be trillions of records of data on the cloud and we're going to need folks who have that analytics or data knack to be able to know how to report on it to be able to know how to engineer ways for that data to live on the cloud and so I think if someone is interested in being a part of the cloud especially as a data analyst I feel like there would be a lot of opportunities to mend those two ways together cloud makes it possible for a lot of people to do their job well so it is an exciting place to be at in my job especially as a data analyst it's really important to have a reliable and robust way for having our data organized for having our data able to be accessed to have an intuitive way to query and report on that information the cloud makes that possible for us because we're able to use tools like BigQuery that allows me to query that data and find where those tables are accessed and it creates a platform for us to have our data warehouse on it and it's fast and it makes it a lot easier and I've been in jobs before where we didn't have something like that I think a lot of folks especially at Google cuz everyone is so smart and just so hardworking I think we all probably are going through something or another and doubting ourselves and feeling like I'm not as smart as a person next to me I'm not as hardworking as a person next to me when
I had first started at Google I mean I like many other folks I had so much impostor syndrome I thought they made a mistake hiring me because I was not the traditional person that I had attributed as someone who gets hired into Google which is the Harvard graduate the person who got amazing grades who was like valedictorian and you know here I was a high school dropout if you want a product to reach multiple people as Google products do then you need diverse people working on it so yeah I think it's super important for folks of non-traditional backgrounds to have their voices be heard and their opinions to be input and woven into products because it makes for a better product a piece of advice that I would share for folks taking the certificate when moments of self-doubt come up is reminding yourself that you are incredible trusting that gut instinct that led you to this certificate in the first place is important to hear and honor I feel like for me coming back to that place of like there's a reason why I'm doing this and I know I like this stuff and I know myself well enough to know that this is just a momentary wave and if I ride this wave through I'll go back to that same place of feeling excited by this topic again business demands are always changing and as a data professional being able to adapt and respond is invaluable sometimes data professionals will wish they could multitask or be in two places at once this actually applies to many jobs for example consider a farmer when the farm is small a farmer might water crops by herself every morning as the farm grows the farmer no longer has time to water the crops as she once did she wants to figure out how to water a huge field's worth of crops efficiently how can she possibly achieve this she invests in a timing system that waters her crops at the same time every day for a preset interval this means that all of her crops are watered simultaneously and she harvests a record number of crops in data management
you'll sometimes encounter similar situations you might need to ingest data on a set schedule or you may need to automatically cap the amount of data ingested you might be asked to be more efficient with a set amount of resources there are many ways you could be asked to scale or streamline data processing in this video we'll explore how data management systems can be implemented and automated to respond to everchanging business needs automation is the use of software scripting or machine learning to perform data analysis processes without human work for example consider a business that has over 200 employees to pay twice per month someone could verify the hours worked for all employees then another person could calculate the amount of each paycheck and a third person could write and sign the checks most businesses don't operate this way instead the payroll process is usually automated verification of hours worked and calculation of pay is all done automatically by a computer program in some cases payment is automatically transferred to each employee's bank account by an automated process automation saves time and resources and ensures that people are paid on time one of the key benefits of automation is that it reduces mistakes during data collection through automatic checks for errors or incomplete data this helps ensure your data set is complete and accurate in situations where compliance is key automation ensures uniformity and accuracy important for compliance automation scales up workloads ingesting or processing more data at once without losing key information finally automation allows for efficiency and process Improvement which can help to preserve resources both human and Technical and controls costs having a clear process for moving data through a cycle makes it easier to control your outcomes with automation added at key areas you'll be able to move data through its life cycle quickly and efficiently now that you've automated repetitive tasks this
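The payroll example above can be sketched as a small script. The hourly rate and the zero-to-eighty hours check are hypothetical stand-ins for a company's real rules:

```python
def run_payroll(timesheets, hourly_rate=25.0):
    """Automated payroll: check each timesheet, then compute the paycheck."""
    paychecks = {}
    for employee, hours in timesheets.items():
        if not 0 <= hours <= 80:  # automatic error check on incoming data
            raise ValueError(f"suspicious hours for {employee}: {hours}")
        paychecks[employee] = round(hours * hourly_rate, 2)
    return paychecks

print(run_payroll({"ana": 72, "ben": 80}))  # {'ana': 1800.0, 'ben': 2000.0}
```

The built-in check illustrates one of the benefits called out above: automation catches errors or incomplete data before a bad value flows downstream.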
frees up your time to focus on the meaningful work of data analysis have you ever cleaned out your closet and donated some clothes to a charity or maybe you've cleared out old pictures from your smartphone so your phone is faster and more efficient now what happens if you donate a sweater you actually meant to keep or deleted a photo that you really wish you still had whatever type of cleaning you're doing it's important to have a plan in place so you don't accidentally throw away something you need a clear process for cleaning or for deleting items makes it easier to keep track of everything in your role as a data analyst you'll likely encounter something called a data retention policy at your organization this is a key part of data management which deals with how and when data is saved or deleted data retention is the collection storage and management of data every organization that collects data should have a policy for its retention these retention policies may consider internal needs industry regulations and laws that apply to the particular business a data retention policy should clearly Define the scope of the policy and the reason for saving information it also defines whose data may be collected both internally and externally and as applicable the policy May list any laws and regulations the organization follows the policy provides a detailed description of the organization's data retention process for example a schedule for saving and deleting data rules for keeping data safe and processes for destroying data it also includes guidelines for data breaches other guidelines that may be a part of a data retention policy include the format of the data whether the organization archives or destroys data and who has the authority to make decisions about deletion now that you have an idea of what goes into the data retention policy let's review some of the regulations that may affect your data depending on the industry you work in a business may need to adhere to 
the following regulations or standards it's important to call out that our discussion in this course should not be considered legal advice if it's a publicly traded company in the United States it must adhere to the Sarbanes-Oxley Act for data retention Sarbanes-Oxley requires organizations to maintain internal control systems that protect financial data of consumers a business that accepts credit card payments must have policies that follow the payment card industry data security standard PCI DSS this standard requires organizations to build secure networks protect cardholder data and Implement strong access control measures a business must also regularly Monitor and test all of these elements certain health care organizations in the United States are subject to the Health Insurance Portability and Accountability Act or HIPAA HIPAA establishes requirements to protect an individual's identifiable health information and any organization that processes personal data of European Union citizens even if the organization is based outside of the European Union must follow the General Data Protection Regulation or GDPR GDPR requires that reasons for personal data collection must be transparent and a business may only use data for the stated collection purpose GDPR also provides European Union citizens with certain rights and requires that businesses securely store personal data when using a cloud service like Google Cloud a data professional will usually activate a bucket lock feature a data professional can use this as a tool to create a data retention policy for each storage bucket and Define how long it will be retained Google Cloud's bucket lock also provides access to a detailed audit log which gives users a clear picture of data requests and responses this helps a business maintain compliance with policies about who accesses data and when finally object life cycle management is an automated way a data professional can manage a data retention policy in
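Object lifecycle management of this kind is usually driven by a small configuration. As a sketch, a Cloud Storage lifecycle rule that deletes objects more than a year old looks roughly like this (check the current Google Cloud documentation for the exact schema before using it):

```json
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365}
    }
  ]
}
```

Attaching a rule like this to a bucket means the retention policy enforces itself, with no one needing to remember to delete expired data by hand.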
the cloud an object is data stored in a cloud object storage service the cloud service provider stores the data in large chunks with Associated metadata this management process enables a data professional to create parameters for object storage to delete objects or data or move it into different buckets with predefined parameters automation makes it easy for the business to keep track of important information you can use the strategies in this video to make your data more organized and secure you'll keep what you need for as long as you need it and get rid of what you don't to keep processes moving efficiently for Speedy Data Insights have you ever worked on a document and saved multiple versions of it along the way it can be really difficult to find the most up-to-date version it's also sometimes challenging to keep track of a document while multiple people are working on it at the same time and worst case scenario someone might delete the doc altogether thinking it's no longer useful in this video we'll discuss Version Control and holds as effective ways to ensure a document's current version is the most accurate for all document editors in the data analytics field data professionals use versioning and holds to keep data current and make sure everyone has access to the accurate version let's start with versioning versioning is the process of creating a unique way of referring to data this may be through an identification number a query or date and time identification the main goal is to label a file or data set with the date and time that it was created this means that even if you have multiple versions of the same data you can always refer to the date and time to ensure you're working with the most recent version this is especially important if you're on a team where multiple people have access to the same files and may be creating versions the process of versioning can also include saving a data set with a new name or saving data in a new file path with
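The date-and-time labeling described here can be automated with a small helper. The base name and format are only an example:

```python
from datetime import datetime, timezone

def versioned_name(base, ext="csv"):
    """Label a file with the date and time it was created,
    e.g. payments_report_20240501_134500.csv (format is illustrative)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return f"{base}_{stamp}.{ext}"

print(versioned_name("payments_report"))
```

Because the timestamp sorts alphabetically, the newest version always appears last in a file listing, which makes it easy for everyone on the team to spot the current one.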
clear directions about how and where to save files. You can track current versions with all of your teammates. There are several benefits of versioning. Along with making sure you have the latest version of data, so a data professional can perform accurate and reliable tests on it, versioning can also support quality control. Sometimes there may be an editing error or an error in data import; being able to revert to a previous version allows you to return to the most recent accurate data set. And it's valuable when addressing compliance concerns: you can version data that lives in databases or warehouses and cycle it out by version. Plus, sometimes when you find an error in code, you can go back to the original code to find where edits are incorrect. Okay, now let's move on to holds. A hold is a policy placed on a data set that prevents its deletion, or prevents deletion capabilities for certain accounts. Sometimes a data set will have an organization-wide hold, meaning no one can delete it. There are several benefits of holds. Most importantly, they prevent accidental deletion and preserve data indefinitely, which means someone can't remove data that may be needed in the future. Hold policies allow the business to determine and define each user's responsibility for data sets. For example, some users may have access to view and edit data but are prevented from deleting it. A business may put both versioning and hold policies in place if there are security concerns about sensitive information. A business may also create policies to help manage storage costs. Ultimately, both versioning and holds support increased collaboration, accuracy, and better business results. Implementing these policies is just one way that data professionals help their organizations achieve data excellence. Nice job, you've made it to the end of this section! Before moving on, let's review what you've learned. You explored data management when working on a data project. You also found out about data teams
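To make the versioning and hold ideas above concrete, here is a minimal sketch in Python. The `DataSet` class and its methods are hypothetical illustrations for this course, not a real cloud storage API:

```python
from datetime import datetime, timezone

class DataSet:
    """Toy model of versioning and holds on a stored data set."""

    def __init__(self, name):
        self.name = name
        self.versions = []   # list of (version_id, payload) pairs
        self.hold = False    # when True, deletion is blocked

    def save(self, payload, when=None):
        # Versioning: label each save with a date-and-time identifier.
        when = when or datetime.now(timezone.utc)
        version_id = when.strftime("%Y%m%dT%H%M%S")
        self.versions.append((version_id, payload))
        return version_id

    def latest(self):
        # The newest timestamp identifies the most recent version.
        return max(self.versions)[1]

    def delete(self):
        # Hold: a policy that prevents deletion while it is in place.
        if self.hold:
            raise PermissionError(f"{self.name} is under a hold")
        self.versions.clear()
```

After two saves, `latest()` returns the payload with the most recent timestamp, and calling `delete()` while `hold` is set raises an error instead of removing data.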
and data cycles, and how they work together. Now let's review some specific key points. You learned about the importance of data security and safety, and retention and deletion policies. Then you explored the stages of the data life cycle, and you considered the roles in the data life cycle you might play, plus where your team members make their own contributions. You then went on to identify how automating the data life cycle can resolve issues and boost efficiency. Finally, you identified strategies for safeguarding and controlling data, including retention policies, versioning, and holds. Congratulations on all of your progress so far! Hi there, welcome to the next stage of your cloud data analytics journey. In this section, you're going to dive deeper into data analyst roles and responsibilities, cloud data tools, and data identity and access management. You'll first check out the unique ways the data analyst role operates in cloud data analytics. Then you'll learn more about other key team members who interact with data; this will help you collaborate like a pro. Next, you'll explore Google Cloud's data and analytics portfolio. You'll get a high-level introduction to the tools BigQuery, Looker, Dataproc, and Dataflow. Your final section will focus on data access management. You'll understand why this is an important part of the data analyst role. You'll also learn about identity and access management. We're going to cover a lot in this section, and it will be a great glimpse into the day-to-day activities of your future role in the cloud. Let's do it! Wherever your data journey may take you, the cloud is going to have a big impact on your data career. To make the most of all the cloud has to offer, you'll need some specialized knowledge about many of the unique aspects of data analysis in the cloud, and that's what we're going to explore in this video. First, let's understand cloud data analytics. Cloud data analytics is the process of analyzing and extracting meaningful insights from large volumes of data
using cloud-based services and solutions. It enables data professionals to analyze data and use services and systems all hosted on the cloud. Cloud data analytics allows analysts to access large amounts of data from multiple sources without having to worry about setting up the infrastructure and security for each data set. Even for organizations that also have data on premises, on physical servers, with cloud data analytics a major portion of their data analytics is hosted in the cloud. This includes the data itself and the systems used to manipulate and analyze it. Let's check out some of the key advantages of working with data in the cloud. One big benefit is having quick and easy access to real-time data from different sources. Plus, analyzing large data sets is simplified because computing and processing power don't rely on local machines. Plus, data professionals can aggregate and analyze data right in the cloud. This is especially valuable as cloud data analytics continues to have a huge impact on website data, sales data, financial data, and performance data. For example, if a web server is hosted in the cloud, much of the data will be generated right there, so it makes sense that aggregation and analysis also happen in the cloud. Here are some specific ways data analytics works in the cloud: sentiment analysis and customer segmentation. On a sentiment analysis project, you can monitor the feelings of customers, employees, and competitors on social media. Using cloud analytics, you can bring in data from all major social media platforms and create summaries of themes in feedback. When working with customers, you can use cloud analytics for customer segmentation, to group customers by their behaviors, needs, and preferences. With cloud databases and tools, you can automatically ingest large amounts of data and train your system to generate themes and summaries. Much of this is made possible by a cloud-based data pipeline; through that pipeline, information moves seamlessly from creating data to
archiving data. Pro tip: in a professional setting, some data analysts may never have to access databases on physical or on-premises hardware at all. In your career, you'll probably build SQL data pipelines in a variety of cloud-based tools like BigQuery. You'll create visualizations in Tableau or Looker, and maybe use predictive analytics from the latest artificial intelligence offerings. You might even find yourself creating exciting machine learning models. Of course, you'll perform the typical analytics tasks of data ingestion, cleaning, and transformation. A benefit to hosting versus on premises is that with hosting, you'll have cloud tools and platforms to help you with those tasks. Whether your future role lives on premises or in the cloud, you now know how the cloud affects many facets of the data analyst's daily work. This know-how will help you maximize the cloud's amazing tools and solutions. As the cloud continues becoming a more prominent presence in the data world, all kinds of exciting new data tools will arise. As a data professional, it's important to know that cloud data tools will usually be specific to each stage of the data life cycle, and there will probably be a lot of choices for each stage. So how do you choose the right solution for a particular task? In this video, we'll introduce you to some of the tools offered by Google and explore how they can support many types of projects and initiatives. First is BigQuery, a serverless data warehouse. This tool is pretty popular in the data world because it works across many major cloud platforms. With BigQuery, you use SQL to query and organize data. There are also some really cool machine learning and artificial intelligence tools integrated within the platform, and BigQuery includes built-in business intelligence engines to create interactive and responsive data insights. As a data analyst, you may use BigQuery to process seemingly unlimited amounts of data, think terabytes and even more terabytes, without having
to manage the database. Another solution that may become part of your toolkit is Looker. Looker organizes business data and builds workflows and applications based on data insights, but it's primarily a data visualization and reporting tool. As a data analyst, you might use Looker to integrate various file types, like CSV, JSON, and Excel, into a single application. You may also use it to publish data in a variety of dashboards. Next is Dataproc. This service is fully managed and allows you to run Apache Hadoop, Apache Spark, and Apache Flink, along with many other open-source tools and frameworks. Dataproc can modernize data lakes and perform ETL functions, or extract, transform, and load, for massive amounts of data. As a data professional, you might use Dataproc to process large data workloads and place them in clusters that allow you to access your data quickly. Then there's Dataflow, which gives you the ability to stream and batch process data in a serverless application. Data professionals may use Dataflow to develop data processing pipelines; it can assist with reading the data from its original source, transforming it, and writing it back to your database. You may also work with Cloud Data Fusion, another fully managed service that allows you to integrate multiple data sets of any size. As a cloud data analyst, you might use Cloud Data Fusion because it allows you to use a graphical user interface instead of code to manage your data pipelines. Dataplex also allows you to work with multiple data sources. It creates a central hub for managing and monitoring data to work across various data lakes, data warehouses, and data marts. As a data analyst, you may access Dataplex with BigQuery; this will allow you to access data from a variety of databases and platforms with a single interface. Finally, BigLake is a storage engine that you can use to unify data warehouses and lakes with BigQuery and open-source frameworks. It also provides options for access control and multicloud storage. These are
just some of the standard tools you might encounter as a data analyst, and there are even more on the horizon. Hello, data enthusiast! I'm so happy you've joined me for this investigation into data access controls and best practices of data management. As you're about to discover, a major part of data management is ensuring the data is secure, and security in this context is about where the data is stored and who has access. Let's begin with an example. Picture an office manager with a security device at their office entryway. They need to provide an up-to-date access code to their staff and trusted vendors. In these circumstances, the office manager can give other people the access code to allow them in. In other words, sharing this code means they can give access to only the people they trust and keep out those who shouldn't enter. Chances are you use unique identifiers like this access code all the time; other types of unique identifiers include the code to access a phone or a computer. Data access management works in a similar way. As a data professional, you give specific people access to your data and your data processes, and keep out those who shouldn't have access. Okay, first a quick definition. Data access management is the process of implementing features, including password protection, user permissions, and encryption, to protect data and related processes. In the context of the cloud, it includes the creation of individual user roles or user groups, and the access for each role or group is predefined by an administrator. The act of assigning this access is called identity and access management, or IAM. IAM is the process of assigning certain individual roles or groups access to particular resources. Some key resources may include software hosted in the cloud, data sets, databases, or entire data warehouses. The level of access is determined by the role of the user. There are three main components of identity and access management: principal, role, and policy. First,
principal is the account or accounts that can access a resource. You can think of this as the account attached to each username; it's important that each account has a unique identifier. The second aspect of identity and access management is role. This is the collection of permissions that list the elements that an account has access to. Each principal may have a different role. And finally, policy is the collection of roles that are attached to a principal. Let's explore how this might work in an organization. As a data analyst administrator, you're setting up a security group for the entire data team in the finance organization. The principal is a group that contains all the individual usernames for those on the data team. The roles in this group include both data writer and data reader; both roles are attached to the principal. The policy is both data writer and data reader attached to the principal's username. Now, universal access, access by role, and access by environment are the primary access types you'll encounter. Here's how they work. Universal access, as the name suggests, is access that everyone needs. This may be universal access to an entire project or to one portion. Next, access by role is access for each individual role assigned to a project; creating profiles by role helps determine what access is needed for the tasks each role will perform. Finally, access by environment is determined by user location, so a remote user may have different access than someone who's on site. There are some best practices to help ensure data is accessed according to assigned permissions. First, be sure to audit data access and monitor user tasks. Second, put additional permissions on remote access. Also, verify and stop unusual activities. And finally, add two-factor authentication or strong passwords to accounts to ensure security. It's also a good idea to regularly check access roles: users may not need as much access as originally provided. Alternatively, a user may need increased permissions to
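The principal, role, and policy concepts described above can be modeled in a few lines of Python. This is only an illustrative toy for the course, not the real Google Cloud IAM API; the role and principal names are made up:

```python
# Roles: named collections of permissions.
ROLES = {
    "data_reader": {"data.read"},
    "data_writer": {"data.read", "data.write"},
}

# Policy: the roles bound to each principal (an account or group).
POLICY = {
    "group:finance-data-team": {"data_reader", "data_writer"},
    "user:guest@example.com": {"data_reader"},
}

def has_permission(principal, permission):
    """True if any role bound to the principal grants the permission."""
    return any(permission in ROLES[role] for role in POLICY.get(principal, ()))
```

Here the finance data team can both read and write, while the guest account can only read, and an unknown principal has no access at all.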
complete a project. And it's always possible that users change departments, take on new roles, and need an update to their permissions along the way. It's very important to document user access so you have a record of who has access and why; this also helps audits go smoothly and is critical for compliance documentation. Consider sharing this information in a team playbook so access issues don't become blockers to completing work. With effective access policies in place, you'll know you're helping your employer keep its data safe and sound. Welcome, future data pro! Thanks for joining me for this investigation into business data requests. The business data request is the kickoff that helps launch a successful project; it's where all the fun starts. A business data request is any business question that can be answered with data, whether the request involves data that already exists or new data. As a data analyst, you'll typically receive these requests through a ticketing system. This is a tool for recording requests and assigning them to the appropriate team member. Data requests can be simple or complex: some can be responded to by a single data analyst, others will require a team of data analysts to work together to generate a large report. Typically, data requests will vary from organization to organization, but they will have some similar elements. First, requests can be internal, from employees within the organization, or external, from outside. External requests may come from government or regulatory agencies, vendors, users, or clients. Types of requests might include answers to a data-related question, or the creation of a data report, an extract, or a dashboard. And requests usually include information or structure specifics. Here's a case to consider. Drew, a data analyst, works for a company that runs clinical trials for a health care product. Stakeholders want to know how long participants spend filling out surveys during the trial, so Drew receives a business data request asking for a
report about survey timing. This is an internal request for a data report; Drew can also potentially add the report information to a dashboard. As Drew begins to work on this request, there are several things he keeps in mind. First, the overarching goal: in this case, the goal is to find out how much time participants are spending on surveys. And it's also a great practice to go beyond this and try to understand why the stakeholders need the data. An example question Drew might ask about the goal is: do the stakeholders need to correlate the length of time taken with the quality of the response? Then Drew will determine what's being measured. In this case, it's pretty straightforward: the length of time participants take to finish a survey. Next, Drew figures out what data is needed and how much. Does he consider all survey response times, or only the surveys that launched in the last year? In this case, he determines he wants all surveys, but segmented by survey name. Drew reviews the survey completions to identify any errors or outliers in the data. For example, if the average survey completion time was 10 minutes and a participant took 24 hours, is that a bug, or did they actually take that long? Finally, Drew searches for trends to identify within the survey name segment. For example, does one particular survey take longer than the rest? Next, he gathers the data and creates a summary report of all of the information. This enables Drew to find connections and draw conclusions about how long participants spent filling out surveys during the trial engagement. This is just one example of a data request, and you're sure to come across lots more throughout your data career. But no matter what question you're trying to answer, working through these steps will help ensure that you gather the right data and use it to come up with a brilliant solution. Hello, data detective! Always being curious, enjoying investigation, and committing to tracking down necessary information are all big parts of the data
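A check like Drew's, flagging survey completion times far outside the typical value, could look like this minimal Python sketch. The multiple-of-the-median rule here is just one illustrative heuristic, not the only way to define an outlier:

```python
def flag_outliers(minutes, factor=3.0):
    """Return completion times more than `factor` times the median.

    A simple heuristic for spotting values like a 24-hour entry in a
    survey that usually takes about 10 minutes.
    """
    ordered = sorted(minutes)
    median = ordered[len(ordered) // 2]
    return [m for m in minutes if m > factor * median]
```

With completion times of `[8, 10, 12, 9, 11, 1440]` minutes, only the 24-hour (1,440-minute) entry is flagged for a closer look.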
analysis process. Yet sometimes that information just isn't as useful as we'd hope. In this video, you'll gain some great strategies for figuring out how to fix this problem. Maybe as a data analyst you're asked to respond to a business data request; this is a business question that you can answer with data. Your first step is to assess the request to make sure you understand what's involved and the goal or conclusion desired. Let's take up this case study: the request involves a supply chain organization that wants to determine if supply partners have a successful customer experience when purchasing inventory from their online store. Now, there are a few ways to confirm whether you have the data you need to respond to this request. You could use analytics software to check the structure of the database, then determine if it is possible to learn when supply partners began searching for a specific product compared to when they clicked the purchase button. You might also track timestamps on their visits to your website. This can help confirm that partners are able to efficiently find and buy what they need, and it lets you know that they're actively engaging with your website. Next, you move on to data cleaning, sometimes called data wrangling. In this phase, you correct or eliminate any inaccurate or repeat records from the data set. For example, maybe a supply partner omits important details from an order, then creates the same order again to correct their mistake; you'd want to remove the duplicate. Also during this phase, you have the opportunity to learn more about your supply partners, define specific segments, and then break down the data set into smaller sections. Now it's time to check for any outliers or deviations in the data set. If a partner mistakenly enters an order for 1,000 products but they really only need 10, you'd want to address that outlier. Here you also integrate multiple data sources, or use data from multiple methods or sources, to ensure information is accurate. In
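The duplicate-removal step described above could be sketched like this in Python. The record layout and field names such as `partner` are hypothetical, chosen only for illustration:

```python
def drop_duplicate_orders(orders):
    """Remove repeat order records, keeping the first occurrence of
    each (partner, product, quantity) combination."""
    seen = set()
    cleaned = []
    for order in orders:
        key = (order["partner"], order["product"], order["quantity"])
        if key not in seen:
            seen.add(key)
            cleaned.append(order)
    return cleaned
```

If a supply partner submitted the same order twice, only the first copy survives cleaning.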
this example, you'd use data from your website, your customer management system, and your purchase orders, if available. You could also check summary statistics. You can create these in a dashboard or spreadsheet that calculates the average (mean), median, minimum, maximum, and other relevant statistics and calculations, and you can also run a quick SQL query that does the same. Finally, if you've done this type of analysis in the past, it might be helpful to compare the two projects. If the results vary widely, this likely indicates that you need to recheck your data set and your process. Better yet, consistent results help confirm that you're working with valuable, useful data. By putting on your detective's cap and viewing data from many different perspectives, you can be sure you're using the best possible data. Your stakeholders will appreciate your investigative spirit when their requests are fulfilled skillfully and thoroughly. Hi there, and thanks for joining me as we explore a business case about data management and storage tools. Data analytics in the cloud brings with it a library of tools that can assist with each phase of the analysis process. In this video, you'll work through a business data request and explore the tools you might use to fulfill it. In this example, you're a data analyst working for an app-based gaming company. After a few months on the job, you receive a business data request from the advertising team. They want to know how many users, and what percent of users, are clicking on ads during gameplay. The advertising team will use your findings to decide whether they should continue showing ads during gameplay, or instead just charge users more for the app. Your game app is hosted on various platforms, and users are able to purchase a subscription if they want fewer ads, so you'll need to gather data from multiple data sources. The first data management tool you use is Google Cloud Storage; this will host all data once it's compiled. With this tool, you can upload data from
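The summary statistics mentioned above, the kind you might show in a dashboard or compute with a quick SQL query, can also be produced with Python's standard library. A minimal sketch:

```python
import statistics

def summarize(values):
    """Compute basic summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }
```

For example, `summarize([2, 4, 6, 8])` gives a mean and median of 5, a minimum of 2, and a maximum of 8.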
remote servers through an internet connection. While working towards your business question, you're able to move your application databases, analytics, and data science tools into one place. You also use Dataflow in this project to create a data pipeline. Dataflow gives you the option of either stream or batch processing. This makes it possible to stream data from your various apps into one database, and you can also use the batch processing feature to import any data you already stored on your computer. Cloud Data Fusion is the tool you use to build a data pipeline. You can also create data integrations, run high-volume data pipelines, and continuously integrate user data from your applications into your database. Once you have data from all sources streaming into your Google Cloud Storage, you use BigQuery to write SQL queries to join and clean the data, to ensure it's complete, has no duplicates, and will be useful for your project. Next, Dataproc allows you to use open-source data analytics tools at a large scale, and you can apply programming languages and algorithms to data, also at a large scale. With these tools, you're able to move your data to the next phase and start drawing conclusions. After analyzing the data using the relevant tools, you share your discoveries with the advertising team, highlighting how many users click on ads and what percentage of total gamers that equals. With this information, the advertising team can effectively choose whether to keep using ads to generate revenue or charge users a fee. You've just experienced a typical business case involving data management and storage. Along the way, you successfully integrated multiple tools to complete a data workflow and find important answers for your stakeholders. Cloud data professionals are excellent data stewards, and with the strategies from this example case, you now understand how to use key tools and solutions to protect and control all kinds of data. Hello, future data expert! Welcome to this video all about
Dataproc. Together we'll explore what makes it such a powerful tool, and how you can implement it in order to both save time and increase the reliability of your data. Let's begin with an example. Maybe you work for a major online retailer that sells more than 500 products and sources them from about 2,000 suppliers. Obviously, this adds up to a lot of data, both structured and unstructured. This retailer is working on a strategy to stay competitive in the fast-paced world of online retail. Staying ahead of the competition is a top priority; that's why your company is so focused on building a cutting-edge data system that revolutionizes the way they process data in customer reviews. To help achieve this objective, you and your data team utilize Dataproc to manage the unstructured data. Dataproc is a fully managed service that maximizes open-source data tools for batch processing, querying, streaming, and machine learning. One of the benefits of Dataproc is that you can increase and decrease compute resources based on the project, so you don't need to guess, underestimate, or overestimate resources. This means your online retailer can dedicate cloud computing resources that focus only on processing customer reviews, and you can have separate resources that focus only on processing sales information. You also learn that Dataproc works especially well with Apache Hadoop, a framework that distributes the processing of data across clusters of computers. This also means that we can scale up to use the power of multiple computers, even thousands, to process our data in real time when we need it. Dataproc uses its open-source tools on virtual machines that scale up and down as needed, and it can install a Hadoop cluster in 90 seconds. Let's learn a bit about how Dataproc scaling works. Dataproc disaggregates storage and compute services, so both can be created and terminated as needed. This means that once you have data to store, your storage will remain, but if you're not actively processing data,
the compute services will terminate. For a more specific example, let's go back to our online retailer. So, your company is going to store external application logs in Cloud Storage. Then the data will be processed by Dataproc, which can write it back to Google Cloud Storage or BigQuery. You can also send the data to a data science notebook for analysis, or to data scientists who can build and train AI models. Because storage is separate from processing, your organization will save money by grouping tasks, so you'll be using exactly the right amount of memory and storage space to meet all your data analysis needs. Your company has always stored data in different servers, including some hardware-based servers. You learn that Dataproc has pre-built templates that can help you move your data from your existing structure into Google Cloud Storage, or GCS. Some of the common templates include Snowflake to GCS, Redshift to GCS, S3 to BigQuery, and Kafka to BigQuery. This means your databases in Snowflake, Redshift, or Amazon can be automatically moved into Dataproc with these templates. Plus, Dataproc makes it easier to manage Hadoop, and you can integrate organization-wide security. Plus, you can enable data users through integrations, giving varied permissions to different users. All right, now that your company has integrated Dataproc, it's able to provide data on all 500 products to all of its vendors. This has given your data team increased visibility into the best products to source and sell. Best of all, your suppliers receive a smoother, consistent flow of valuable recommendations, enabling each vendor to tap into the waves of user demand like never before. With real-time feedback and actionable insights, suppliers can fine-tune their product descriptions, ensuring they resonate with users on a whole new level. Hello, and welcome to this video about process management! We are going to dive into some great strategies for keeping your data projects progressing smoothly. Data analysts usually have several projects going on at the same time, all at different stages and involving different tasks and teams. Managing this work simultaneously requires the talent of a skilled juggler, gracefully keeping several balls in the air at once. Each project represents a different ball, and you must carefully coordinate your movements to maintain control over all of them. With a well-organized internal process workflow, you can become a masterful juggler, skillfully switching between tasks and preventing any balls from dropping. There are three key parts of a typical data analyst workflow: using a central data request system, checking in code, and keeping an internal record of your work. Let's review each one now. First, it's important to have a central data request system to store and manage business data requests and related team conversations. This system provides relevant documentation, enables collaboration, and preserves historical records. Documentation is a process for preserving all of the details and conversations about each request in one place, making information easy to find when a data analyst needs it. Collaboration happens when members of the data analytics team visit the central system to review data requests, including queries and data delivery methods. Historical records are past data or information that's been stored and can be accessed or retrieved when needed. This supports collaboration and helps refresh your own memory. For example, maybe you can't remember why you wrote a specific query a year ago; if it's documented, you can go back to the original request, review your notes, and remind yourself of why you wrote it. This can be a great timesaver for new projects. And while we're on the subject of queries, it's also best practice for any teams who write code to check their code into some sort of central repository. GitHub is a popular tool for this kind of sharing. To check in code is to upload code to a main repository so others can access it and review
it. You can check out a copy of the code, and teammates will know that you are actively working on a copy of the code. Then, when you are ready, you can request a code review so you can get sign-off from your teammates. Once they approve the code, you can check in the code, and now everyone will have access to the changes you made. You just unpacked the importance of checking in code for effective teamwork. Now you'll explore more benefits of using a check-in system with a central repository to help you manage the code you check in. First, more effective collaboration: another analyst on your team can review your code to catch potential oversights or errors before using it in production. Next, if a data report comes from a centralized repository, updates can be made there directly, avoiding the need to rewrite the report or create inconsistencies with changes in one place and not in others. Third, having code reviews and keeping all the live code in one place can improve code quality by ensuring consistency, following a style guide, and enhancing code readability in terms of using appropriate syntax and well-placed comments. For major changes to dashboards or reports, a code-sharing system lets you track code easily. The next advantage is having an easier way to revert code. This system lets you undo changes with a single click, making it quick and simple to troubleshoot and fix code errors. Next, revision history reveals when, and by whom, code was changed or added, helping you trace its origins and reach out to the relevant data analyst if you need clarification. Finally, keeping all your code in one repository makes it easier to locate; after all, you and your team will always need to know where to find your code files. As a final strategy, keeping an internal record of your work is a great timesaver. You can use the same system when stakeholders submit data requests to create and track your own tasks for different projects. And logging your tasks in a central place not only keeps you organized but
also allows you and your team to share progress. Juggling multiple projects is all part of a day in the work life of a data analyst, but here's the good news: if you've got an efficient system that handles data requests, manages code check-ins, and tracks all your work, then you've got this. You'll have a super organized workflow and deliver top-quality work on all your projects. Welcome, data whiz! Thanks for coming along for this journey into data management. In your role as a data analyst, you may receive data requests. In this video, we're going to focus on strategies for expertly handling these business data requests. A business data request is any business question that can be answered with data. So when a stakeholder needs help from the data team, they'll typically use a ticketing system to submit their request. Then the data analyst uses that same ticketing system to track and prioritize the request. While ticketing systems vary, usually they allow people in the business to submit a question or issue, or ask for a new computing feature. The system then helps people track their requests from start to finish. Other examples of data requests may include asking for a new report, or asking for a modification to an existing dashboard or report. Generally, each ticket submitted will have a details page; comments are written about each item so all communication about the request exists in one place. The general elements of each data request are type, priority, and status. The type category can be set by your organization to help group requests together. You might group requests about a similar issue, or by the person who's responding; grouping the requests allows you to address all aspects of an issue, as different people may point out different problems. The issue priority category does exactly what it says: it helps people determine the priority for each issue. And the issue status allows the data analyst to provide a real-time status update about how the request is progressing. Besides
the ticketing system it's important to intake issues thoroughly this means collecting as many details as you can upfront to avoid a lot of back and forth later when taking requests consider what when who where why and two hows let's illustrate all of this with an example maybe as an analyst you're working for a nonprofit solar energy initiative your organization is tracking how a particular program is affecting smaller communities outside a city you've been taking data requests from all stakeholders in any form and it's starting to get confusing so your data team institutes a ticketing system through this new system stakeholders can now submit business data requests to ask for reports data clarifications reference data data extracts and more all right here comes your first issue in the tracker you receive a data request asking for a count of communities affected by the solar energy initiative so you ask the following first the what should all solar energy initiatives be counted or only certain types second when how far back should the data go just this calendar year the last 365 days or for all time next who will all communities be included in the data set or only a subset okay now you've come to where so you ask should the data be stratified by any particular region or zip codes when you stratify something you divide it into groups all right now why it's important to understand the business context in which the stakeholder is coming to you with this request for example what are the business questions this report would help answer thinking about context also helps guide any additional questions you may have missed asking in the data intake process next is the first how this how is about refreshing the data so you ask how often should this data be refreshed is this just a one-time pull or will it be needed on an ongoing basis and finally the second how is about data delivery for this you consider how the stakeholder wants the data delivered they might prefer a
lightweight static report like a spreadsheet or a more robust dynamic dashboard these are a lot of details to cover with your requester thankfully having a system allows you to document the details so you can easily refer back and iterate if needed with this info you're able to get back to your stakeholders with valuable data insights but wait you've just received another ticket now one of your stakeholders wants to know how many buildings have reduced energy costs compared to buildings that didn't opt into the solar initiative when you start to research this question you realize that you need to first create a report listing which buildings received solar energy in which year then you'll need a report that segments those buildings by start date for these tasks you create something called a parent child relationship luckily this is another great feature of ticketing systems a parent child relationship enables users to divide a single ticket into sub tickets which can be worked on simultaneously sometimes even by different team members so for example identifying which buildings received solar energy over the past year is a separate task that needs to be finished before you can answer the main question related to overall solar savings with this in mind you create a child request for each new report and attach them to the parent the original request you received in the ticketing system you then define the relationship between the requests and order them in a way that makes sense great work now you're prepared to answer the stakeholder's latest question as a final point let's go over how to track all of the requests in a ticketing system for this you'll use status fields some common status fields are assigned start work in progress fixed verify and reopen your organization may customize these based on their needs but the key is that the status fields enable everyone to easily understand the status of a request also saving fixed and verified requests creates a
handy database of previously answered questions so for the purposes of your solar energy project you'd indicate that the issue regarding the number of buildings using solar has been verified along with the request recording the total amount of savings nice using an efficient tracking system enables you to share data quickly and accurately it improves overall communication and collaboration and it'll definitely make your life easier when you have numerous requests coming in for your expert data work hello and thanks for being with me to discover some great ways to keep data team members on the same page and I do mean page because this video is all about data documentation data documentation is a written guide to the data contained in a data set how it was collected and how it's organized data documentation typically includes the purpose of collecting the data the procedures followed the date and time data was collected the structure of the data set and any notes about data validation or quality assurance data documentation may come in the form of readme files a data dictionary a code book a lab notebook a spreadsheet and more let's get into this with an example maybe you've just joined a data team as an analyst at a cutting edge renewable energy company as you start to become familiar with the data available to you you notice that wind and geothermal energy are recorded on different spreadsheets with different columns and notations and you notice that the different departments within your organization are keeping their own data tables and integrating them together for reporting you also wonder why your team only has access to data for the last 2 years when there is at least 10 years worth of data available just then one of your new co-workers hands you the team's data playbook and documentation for the data sets you're using they explain that the playbook documentation defines the data which will help you understand the information you're working with full of
curiosity and momentum you delve into the playbook ready to uncover solutions that will bring clarity to the data puzzle you discover that the team's data playbook provides the company's overall plan for handling and managing data with team specific processes for a data analyst the most important elements of the team playbook are information about how to request data access how to grant data access to others where your team stores data tables in BigQuery and how to carry out common tasks as an added bonus sometimes a team playbook includes sample queries for common requests like merging data as you review the playbook you start to understand why the wind and geothermal data are kept in separate files you find out that there are certain individual projects that are crucial to the environmental success of the company and in some of the case studies included in the playbook you find some great examples of how the data has been merged together in the past for other initiatives you join your team in a retrospective and discuss the benefits of sharing the playbook to complete your data analyst tasks you learn that the data playbook is a living document meaning it's updated as needed so when you become more accustomed to working on the team they look forward to your contributions as well now you understand the importance of data documentation like a data playbook and how teams follow documentation procedures to collect and manage data having an up-to-date playbook will help ensure you and your data team stay on the same page congratulations you've accomplished so much during this section of the program you're well on your way to becoming a cloud data analyst now you understand lots more about the work of data analysts in cloud data and how you might collaborate on a data team you also discovered the basics of some key cloud data tools and processes along the way you focused on data gathering plus data access management and storage tools this also
included an investigation into data processing and transformation this section also covered a lot about data requests data documentation and team playbooks thanks again for joining me and well done you're making outstanding progress as a hiring manager I really want to screen people in rather than screen people out we're always looking for reasons to say yes to people rather than to say no I'm Vince and I'm a cloud data engineering manager that means that I work with a team of data engineers to help customers solve data and analytics challenges the qualities we look for in candidates are general facility with data a little bit of an analytics background is always good familiarity with technologies like SQL big data technologies and batch and stream processing workplace skills are critical workplace skills are almost everything else that you do other than the hard technical skills that you exercise in your job so the ability to collaborate the ability to find compromise the ability to lead all these things go into being a really strong employee when it comes to showcasing your workplace skills it's not just important what you would do to solve a problem but how you would solve the problem helping your interviewer understand your process and how you get to a solution is just as important as the solution itself I think confidence is really important when I'm interviewing a candidate I like to see that a candidate is comfortable in a situation like interviewing which is inherently kind of uncomfortable it's a good signal if someone shows up with a high degree of confidence to an interview that they'll show up well when they're in high stakes situations with customers candidates have a lot of power in the interview process if you've made it to the point of an interview then the company views you as someone who might be someone who they want to work there I think it's important to ask questions of your interviewer that
would give you some insight into what it's actually like to work at the company you're interviewing for questions that you might have about things like work life balance one question that I always like getting from candidates is what's one thing you would change about the company that you work for if you're considering getting started in a career in data analytics I would really encourage you to find some public data and use some of the great open source tooling out there and start playing with some data it's a great way to build experience and also see if it's something you'd like to build a career around hi I'm Vince hi I am Monika congratulations on making it through this course now you're going to get a sneak peek at what an interview on the topics in this course would be like we hope this will let you know what to expect in your next interview what interests you in a career in cloud data analytics I have always been very passionate about numbers I understood and worked with them very well through high school I remember math being one of my highest scores in my report card and then when I had to choose a career path I decided to do computer information systems which was like the business and also the technical side of analysis one of the projects that I worked on in the university was to create a data set schema and I remember then I was amazed by everything data so that's what got me into data analytics so tell me about your work experience and how that prepares you for a role in data and analytics a previous role that I had was an IT technician part of my role was to be responsible for creating a dashboard to show the usability of a tool after the analysis we concluded that the tool was being used but just by promoting it a little bit more it could be more useful and decrease the time of troubleshooting for the techs what are some ways that cloud computing can be challenging for a company cloud computing can be challenging for a company integrating
data from different sources also compliance they have to meet strict regulatory and privacy standards can you talk a little bit about your experience with either of those two areas sure I would say when I created the dashboard that I was talking about earlier all the information that I was using was from different data sources so I had to basically clean and then join them using the same attributes so imagine this scenario a stakeholder approaches you and says that they need a report on company payments as an analyst how would you intake this request if a stakeholder comes to me asking for a report of payments I will ask some questions some of the questions that I would ask is what is the objective of this analysis so I know the purpose of it I would also ask about the time period do they want to do an analysis of a year of two years or of just a month so I can gather the information the data that I just need for that dashboard and lastly I would ask if there's any PII or data security request for this dashboard all right so you've completed your requirements gathering what are a couple of things you might need to start the analysis so the first thing that I would do is to make sure the data that I'm working with is valid and after that I would just analyze it using a query what would you query the data with I will use SQL to query the data what's an example of a system that might enable you to query it with SQL sure one example of that could be BigQuery okay Monica I want to thank you for interviewing with us I really wish you good luck with the rest of the process thank you Vincent thank you so much for your time and consideration in this scenario Monica showed how to talk through your answers and show your thinking by sharing your thought process you demonstrate what you know and how you solve problems problem solving is a top skill hiring managers are looking for that's it for now stay tuned for more tips
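The ticketing-system ideas covered earlier in this section (type, priority, and status fields, plus parent-child sub-tickets that must finish before the main question can be answered) can be sketched in a few lines of Python. This is only an illustrative model, not any real ticketing product: the class, field names, and status values here are assumptions chosen to mirror the transcript's description.

```python
from dataclasses import dataclass, field

# Illustrative status values mirroring the common fields described in this section.
STATUSES = ["assigned", "in progress", "fixed", "verified", "reopened"]

@dataclass
class Ticket:
    """A hypothetical business data request in a ticketing system."""
    title: str
    issue_type: str              # grouping category set by the organization
    priority: str                # e.g. "high", "medium", "low"
    status: str = "assigned"
    children: list = field(default_factory=list)  # parent-child sub-tickets

    def add_child(self, child):
        # A child request is a separate task that must be finished
        # before the parent question can be answered
        self.children.append(child)

    def all_children_done(self):
        # The parent can only be resolved once every child is verified
        return all(c.status == "verified" for c in self.children)

# Example: the solar-savings question split into a parent and child report
parent = Ticket("Energy savings vs non-solar buildings", "report request", "high")
child = Ticket("Buildings that received solar, by year", "report request", "high")
parent.add_child(child)
child.status = "verified"
print(parent.all_children_done())  # True once every child request is verified
```

A real system would add the details page, comments, and revision history discussed above, but even this sketch shows why defining the relationship between requests up front keeps the work ordered in a way that makes sense.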
congratulations on finishing this course you are well on your way to accomplishing great things in cloud data analytics my favorite part of working in cloud data analytics is the opportunity to provide answers and insights that a user stakeholder or customer may have never thought possible making data available usable and accessible is no small feat and with the right tools and skills you can really change how people think about problems and make decisions as someone who loves helping people that's exciting now let's take a minute to go over all the things you learned in this course you began by exploring an introduction to the program and some tips for successfully completing the certificate you also learned about cloud computing its components and cloud computing versus traditional computing then you explored cloud data analysis versus on premises data analysis and you learned about the impact of cloud data analytics on all kinds of businesses with a special focus on the Google Cloud architecture framework next you discovered the inner workings of data management and the data life cycle and the cloud data analyst's role in keeping both running smoothly you also explored cloud team collaboration and how this helps teams to create some really cool business projects together finally you discovered what cloud data tools are in an analyst's toolbox and learned about the importance of data documentation in data analytics you now know about cloud data tools and you can understand and communicate cloud benefits share timely insights and so much more congratulations on your progress so far you're off to a great start [Music]