Google Cloud Platform (GCP) - BigQuery for beginners | Analyze data in Google BigQuery


Google Cloud, as Hercules was the strongest man on planet Earth. In that sense, BigQuery is a massive, very powerful product provided by Google, and its capabilities come from the years and years Google has spent maintaining its own data.


You know how big Google is when it comes to maintaining its own data. The same technology that was used to maintain the Google search engine has been brought into BigQuery to handle massive amounts of data. So what is BigQuery? In layman's terms, it's a very, very big database. That's it.


It's a database like Oracle, like SQL Server, like MySQL, like any other database. Just consider it a database, and we will look at what differentiates BigQuery from the others. First, BigQuery is serverless. When we say serverless, it means that it is a managed service provided by Google Cloud.


So you don't need to understand, or worry about, the back-end infrastructure used for running BigQuery. You just need to maintain your data in it. That's it; the rest is taken care of by Google Cloud itself. It's also highly scalable. As I said, it scales automatically to very high limits based on your requirements.


And last but not least, it is a very cost-effective platform on which to deploy your data warehouse and data lake solutions. BigQuery can analyze petabytes of data in seconds, and in the diagram you can see how big one petabyte is by comparing it with the block for a terabyte. So what are BigQuery's features? Serverless, which I have already explained. Real-time analytics: it can help you crunch your data in real time.


There are technologies like Cloud Dataflow and Pub/Sub which can stream and ingest real-time data from your sensors and IoT devices directly into BigQuery. It is automatically highly available, so there is no downtime; Google commits to around 99.9% availability for BigQuery. You also don't need to learn a new query language: standard SQL syntax works in BigQuery just as it does in any other database.


It is an ideal solution for deploying your data lake and data warehouse. If you want to know more about data lakes, please go and watch my video on the topic. The best part is that storage and compute are separated: BigQuery keeps these two operations apart. You also get automatic backup and easy restore, and a flexible pricing model. And last but not least, you get access to a wide variety of public datasets, like GitHub data, blockchain data, and Bitcoin data. All of these are massive datasets. Today, in our exercise, we'll analyze Stack Overflow data.
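To give a feel for what analyzing the Stack Overflow public dataset with standard SQL can look like, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` client library is installed and that credentials are already configured. The query itself, the `TOP_TAGS_SQL` name, the `run_top_tags` helper, and the choice of the `posts_questions` table are my assumptions for illustration, not necessarily what the demo runs.

```python
# Standard SQL that counts questions per tag in the Stack Overflow public
# dataset. The tags column is assumed to hold a pipe-separated string such
# as "python|pandas", so it is split and unnested into one row per tag.
TOP_TAGS_SQL = """
SELECT tag, COUNT(*) AS question_count
FROM `bigquery-public-data.stackoverflow.posts_questions`,
     UNNEST(SPLIT(tags, '|')) AS tag
GROUP BY tag
ORDER BY question_count DESC
LIMIT 10
"""


def run_top_tags(project_id):
    """Run the query and return a list of (tag, question_count) tuples.

    project_id is the project that gets billed for the query; the value you
    pass in is up to you (hypothetical in this sketch).
    """
    # Imported lazily so the sketch can be read and loaded without the library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    rows = client.query(TOP_TAGS_SQL).result()  # waits for the job to finish
    return [(row["tag"], row["question_count"]) for row in rows]
```

Calling `run_top_tags("your-project-id")` sends the query to BigQuery; keep in mind that on-demand queries are billed by the amount of data they scan.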


So all this makes BigQuery a very unique product. Now, BigQuery pricing. This is a very important aspect of BigQuery, because it is what makes BigQuery unique in terms of its overall usage and the cost you pay for it. BigQuery charges for data storage, for streaming inserts, and for querying the data. But loading and exporting data is completely free, so that is a massive advantage.


When you store data, you pay only two cents ($0.02) per GB per month. For long-term storage, which applies to tables that have not been modified for 90 consecutive days, it is merely $0.01 per GB per month. That is very, very cheap. For streaming inserts, the price is $0.01 per 200 MB. Loading, copying, and exporting data, as well as metadata operations, are all free.
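The storage and streaming arithmetic can be sketched in a few lines of Python. This is only an estimate based on the rates quoted in this video; the constant and function names are hypothetical, and the current prices on the Google Cloud website may differ.

```python
# Rates as quoted in the video (USD). Treat them as assumptions, not as
# the current price list.
ACTIVE_STORAGE_USD_PER_GB_MONTH = 0.02
LONG_TERM_STORAGE_USD_PER_GB_MONTH = 0.01
STREAMING_USD_PER_200_MB = 0.01


def monthly_storage_cost(active_gb, long_term_gb=0.0):
    """Estimated monthly storage bill for active plus long-term data."""
    return (active_gb * ACTIVE_STORAGE_USD_PER_GB_MONTH
            + long_term_gb * LONG_TERM_STORAGE_USD_PER_GB_MONTH)


def streaming_cost(streamed_mb):
    """Estimated cost of streaming inserts, billed per 200 MB."""
    return streamed_mb / 200 * STREAMING_USD_PER_200_MB


def loading_cost(loaded_gb):
    """Batch loading, copying, and exporting are free at any volume."""
    return 0.0
```

For example, 500 GB of active storage plus 1,000 GB of long-term storage works out to about $20 per month, and streaming 10,000 MB costs about $0.50.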


Your price can also vary based on the subscription type. If you opt for the pay-as-you-go model, you are charged $5 per TB of data processed by your queries, and the first terabyte per month is free. Flat-rate pricing starts at $10,000 per month for a dedicated reservation of 500 slots. With flat-rate pricing, you choose how many slots, which are units of compute capacity, you want BigQuery to reserve for running your queries. Flex slots cost about $30 per slot per month. If you want to know more about this, you can check the Google Cloud website for detailed pricing.
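To compare the two query pricing models, here is a small Python sketch that uses only the numbers quoted above ($5 per TB with the first TB free, and $10,000 per month for 500 slots). The function names are hypothetical, and the comparison ignores everything except the query bill.

```python
# Query pricing as quoted in the video (USD); assumptions for illustration.
ON_DEMAND_USD_PER_TB = 5.0
FREE_TB_PER_MONTH = 1.0
FLAT_RATE_USD_PER_MONTH = 10_000.0  # dedicated reservation of 500 slots


def on_demand_query_cost(tb_scanned):
    """Monthly pay-as-you-go cost: the first terabyte scanned is free."""
    billable_tb = max(0.0, tb_scanned - FREE_TB_PER_MONTH)
    return billable_tb * ON_DEMAND_USD_PER_TB


def break_even_tb_per_month():
    """Monthly scan volume at which on-demand cost equals the flat rate."""
    return FLAT_RATE_USD_PER_MONTH / ON_DEMAND_USD_PER_TB + FREE_TB_PER_MONTH


def cheaper_plan(tb_scanned):
    """Return which plan costs less for a given monthly scan volume."""
    if on_demand_query_cost(tb_scanned) <= FLAT_RATE_USD_PER_MONTH:
        return "on-demand"
    return "flat-rate"
```

With these numbers, scanning 1,000 TB (one petabyte) in a month costs $4,995 on demand, and the flat rate only becomes the cheaper option once you scan more than 2,001 TB per month.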


So guys, time for a demo. Let's go to my cloud console, understand the BigQuery UI, and crunch some data. Let's get started. I am now in my Google Cloud Platform console, and the first thing you need to do here is go to BigQuery. So I'll click on BigQuery.


Okay. The first thing you need to understand is the hierarchy of the BigQuery product. First comes your project. As I have always told you, pretty much everything you do in Google Cloud goes through your project. So we have created a new project, YouTube BigQuery Demo.


If you want to know more about how we do all this, I have explained it numerous times in other videos of the Google Cloud Platform series, so go watch those. After the project comes the second level of the hierarchy, which is the dataset. So what is a dataset? First, look at the project, which is shown here.


This is our project. Once you click on this project, you need to create a dataset. A dataset can be considered a container: all your tables, your views, everything resides under it. Take, for example, a company that wants to deploy BigQuery. You can have datasets based on departments, on functional areas, or on any other split you want. So you can keep your HR dataset separate from your finance dataset, and you can create other datasets for IT, purchasing, inventory, sales, and so on.
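The project, dataset, and table hierarchy can be modeled in a few lines of Python. This is only an illustrative sketch: the `TableRef` class, the project ID `youtube-bigquery-demo`, and the table names are hypothetical, while the dotted `project.dataset.table` form is how BigQuery writes a fully qualified table name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """One table, addressed by its place in the hierarchy."""
    project: str
    dataset: str
    table: str

    @property
    def full_id(self):
        # Fully qualified name in the form project.dataset.table
        return f"{self.project}.{self.dataset}.{self.table}"


def group_by_dataset(tables):
    """Group table names under the dataset (container) they live in."""
    grouped = {}
    for ref in tables:
        grouped.setdefault(ref.dataset, []).append(ref.table)
    return grouped


PROJECT_ID = "youtube-bigquery-demo"  # hypothetical project ID

# One dataset per department, each holding that department's tables.
TABLES = [
    TableRef(PROJECT_ID, "hr", "employees"),
    TableRef(PROJECT_ID, "hr", "payroll"),
    TableRef(PROJECT_ID, "finance", "invoices"),
    TableRef(PROJECT_ID, "sales", "orders"),
]
```

Here `TABLES[0].full_id` gives `youtube-bigquery-demo.hr.employees`, and `group_by_dataset(TABLES)` shows which tables sit inside each department's dataset.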


And then you can segregate these datasets in terms of security, in terms of identity and access management, which is very good, because you can ensure that a given department can only access the data in its own dataset. So I will click on Create Dataset, and I will name it YouTube Demo BQ dataset.


We'll leave everything else as default, and I'll click on Create Dataset. Once you have created your dataset, you go and create tables within it. So you click on this icon, which is for creating a new table.
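The same dataset can be created from code instead of the console. The sketch below is a minimal version, assuming the `google-cloud-bigquery` library and configured credentials. BigQuery dataset IDs may contain only letters, numbers, and underscores, so the spoken name is first turned into a valid ID; the helper names, the lower-casing, and the `US` location are my assumptions, not something shown in the demo.

```python
import re


def to_dataset_id(display_name):
    """Turn a free-form name into a valid dataset ID.

    Runs of characters other than letters, digits, and underscores become
    a single underscore; the result is lower-cased (a style choice).
    """
    candidate = re.sub(r"[^A-Za-z0-9_]+", "_", display_name.strip())
    candidate = candidate.strip("_").lower()
    if not candidate or len(candidate) > 1024:
        raise ValueError(f"cannot build a dataset ID from {display_name!r}")
    return candidate


def create_dataset(project_id, dataset_id, location="US"):
    """Create the dataset, or return the existing one if it is already there."""
    # Imported lazily so the helper above works without the library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    dataset = bigquery.Dataset(f"{project_id}.{dataset_id}")
    dataset.location = location  # assumption: the US multi-region
    return client.create_dataset(dataset, exists_ok=True)
```

For instance, `to_dataset_id("YouTube Demo BQ dataset")` gives `youtube_demo_bq_dataset`, which can then be passed to `create_dataset` together with your own project ID.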


So I'll click on Create Table. Now I will give this table a name: top_ten_technologies. That's it. It also asks whether you want to partition this table, which is an advanced concept.
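Since BigQuery speaks standard SQL, the table can also be created with a DDL statement rather than through the console form. Here is a minimal Python sketch that builds such a statement. The column names (`technology`, `question_count`), the project and dataset IDs, and both helper functions are hypothetical, since the video does not show the schema; running the statement assumes the `google-cloud-bigquery` library and configured credentials.

```python
def create_table_ddl(project_id, dataset_id, table_id, columns):
    """Build a CREATE TABLE statement from (column_name, sql_type) pairs."""
    column_lines = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS `{project_id}.{dataset_id}.{table_id}` (\n"
        f"  {column_lines}\n"
        ")"
    )


def run_ddl(project_id, ddl):
    """Send the DDL statement to BigQuery and wait for it to complete."""
    # Imported lazily so the builder above works without the library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    client.query(ddl).result()


# Hypothetical schema for the demo table.
DEMO_DDL = create_table_ddl(
    "youtube-bigquery-demo",       # hypothetical project ID
    "youtube_demo_bq_dataset",     # hypothetical dataset ID
    "top_ten_technologies",
    [("technology", "STRING"), ("question_count", "INT64")],
)
```

Printing `DEMO_DDL` shows the statement before anything is sent, which is a cheap way to check the fully qualified table name.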


Let me know if you want to learn more about partitioning. In this video we are just covering the basics, so we'll simply create this table. So this is.