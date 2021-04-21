



The process of data analysis typically involves five steps:

Setting the research goal
Collecting the data
Preparing the data
Exploring the data
Interpreting the results

Setting the research goal means deciding which questions you need to answer. What do we really want to know, and what are our points of interest? The answers determine the data, parameters, and variables needed for the investigation, and the data sources should be chosen accordingly.

Once you have determined the purpose of your research and the type of data you need, you collect that data from the available sources. The data should be as accurate and complete as possible. Otherwise, the whole study could lead to false results and would be useless.

However, it is very rare to have clean data that you can explore immediately. So we take what we have, clean it, and prepare it for exploration.

Only after completing all the previous steps can you explore the data. Descriptive statistics, modeling, and visualization are among the most useful tools here. At this stage, we can confirm or reject our assumptions and answer the questions we asked when setting the goal at the start of the study.

And finally, we come to a conclusion that we can share with our target audience, stakeholders, or ourselves.

Now let’s go through all of these steps one by one.

As mentioned earlier, this project has two purposes: practicing data analysis and getting feedback on my sports activity from the Google Fit application. My assumption was that by using this application on a daily basis, I would become more engaged in sports and increase my activity over time.

The raw data for this study comes from the Google Fit application, which I installed on my mobile phone in September 2019. To download data from the application, you use Google Takeout, a service designed to give users their data from a variety of Google products. All you have to do is sign in and specify what kind of data you want to receive, and Google will send the data to your email in CSV format.

For the Fit application, Google provides a set of files, one per day of the specified period, plus a file that contains the aggregated data. I use the latter because it contains all the information needed. The study period is from November 1, 2019 to December 31, 2019.

This research uses a Jupyter Notebook with the Python modules pandas and Matplotlib.

Let’s start by importing the required modules:

import pandas as pd
import matplotlib.pyplot as plt

Read the data from the file:

data = pd.read_csv('google-fit-data-file.csv')

Let’s take a quick look at our data first:

data.info()

You can see that the dataset has 92 rows and 25 columns. There are quite a lot of empty cells, and some data does not exist at all (height, heart points, etc.). I will decide what to do with it later.

Now let’s look at the first few rows of data.

data.head()

Rather messy and not very informative. By the way, this is why I love data science: it can draw interesting and sometimes unexpected insights out of such a jumble of numbers. I will bring order to this mess soon.

Now that we know what the data looks like, we can start preparing it for further investigation.

3.1. Cleaning

First, the data needs cleaning. What does that mean? It means that everything that has no value or importance to the study should be removed. For example, some columns in the dataset are not needed because they are not relevant to my questions.

As the data information shows, the height and heart points columns are empty. This analysis does not deal with geographic data, so the latitude and longitude columns are not needed either. I don’t want to talk about weight, so I’ll drop all columns that contain it. And since we’re talking about sports, not rest, we don’t need the columns with rest and sleep time.

These columns can be removed with the pandas drop function.

axis=1 means that columns are processed instead of rows. inplace=True modifies the dataset itself instead of creating a copy of the table to store the new data.
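The drop call itself is not shown above, so here is a minimal sketch of what it could look like. The column names and values are assumptions made for illustration; the real export may name its columns differently.

```python
import pandas as pd

# Hypothetical miniature of the export; real column names may differ.
data = pd.DataFrame({
    "Date": ["01.11.2019", "02.11.2019"],
    "Move Minutes count": [95, 120],
    "Height (m)": [None, None],
    "Low latitude (deg)": [50.1, 50.2],
    "Average weight (kg)": [60.0, 60.1],
})

# Illustrative list of columns judged irrelevant to the study.
columns_to_drop = ["Height (m)", "Low latitude (deg)", "Average weight (kg)"]

# axis=1 targets columns; inplace=True changes `data` itself
# instead of returning a modified copy.
data.drop(columns_to_drop, axis=1, inplace=True)

print(list(data.columns))
```

Without inplace=True, the same result is obtained by assigning the return value back: data = data.drop(columns_to_drop, axis=1).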

3.2. Format

From time to time (frankly, most of the time) we have to transform the data we have. There are many reasons to do this, but the result is always a better format, or better data, that is easier to understand, explain, and visualize. The examples below show what I mean.

3.2.1. Type conversion

First, I want to convert the date column to the datetime type. This makes it possible to work with dates as dates, not strings. It can be done with the built-in pandas to_datetime function:

data['Date'] = pd.to_datetime(data['Date'], dayfirst=True)
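To see why dayfirst matters, here is a small sketch on two made-up date strings. The day.month.year text format is an assumption for illustration, not necessarily the format of the actual export.

```python
import pandas as pd

# Assumed text format: day.month.year (illustrative values only).
raw_dates = pd.Series(["01.11.2019", "02.11.2019"])

# With dayfirst=True the first number is read as the day.
day_first = pd.to_datetime(raw_dates, dayfirst=True)

# Without it, pandas reads the first number as the month.
month_first = pd.to_datetime(raw_dates)

print(day_first.dt.month.tolist())    # months when the day comes first
print(month_first.dt.month.tolist())  # months when the month comes first
```

Both calls succeed here, which is exactly the danger: an ambiguous date is silently read the wrong way unless the order is stated explicitly.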

3.2.2. Handling missing data

I also fill all NaN (empty) values with 0 so that pandas treats every cell as a number and no values are missing. This is needed for further analysis and visualization. It can be done with the pandas fillna function:

data.fillna(0, inplace=True)

3.2.3. Unit conversion

Also, if you look at the data, you can see that the units of some values are inconvenient. It is very difficult to work with time in milliseconds and speed in meters per second. Therefore, I convert the numbers to more familiar units and rename the columns accordingly.
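A sketch of such a conversion is shown here. The column names and numbers are hypothetical; only the conversion factors are fixed (60,000 ms in a minute, 1,000 m in a kilometer, and 1 m/s equal to 3.6 km/h).

```python
import pandas as pd

# Hypothetical columns and values, for illustration only.
data = pd.DataFrame({
    "Walking duration (ms)": [4_500_000, 0],
    "Distance (m)": [4200.0, 0.0],
    "Average speed (m/s)": [1.4, 0.0],
})

# Convert each column to a more familiar unit.
data["Walking duration (ms)"] = data["Walking duration (ms)"] / 60_000  # ms -> min
data["Distance (m)"] = data["Distance (m)"] / 1_000                     # m -> km
data["Average speed (m/s)"] = data["Average speed (m/s)"] * 3.6         # m/s -> km/h

# Rename the columns so that the names match the new units.
data.rename(columns={
    "Walking duration (ms)": "Walking duration (min)",
    "Distance (m)": "Distance (km)",
    "Average speed (m/s)": "Average speed (km/h)",
}, inplace=True)

print(data)
```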

Now the dataset looks much better and clearer.

3.3. Adding information

You may need to add information to your dataset. For example, I would like to add one column to my data table: the day of the week (I am really curious whether my sports activity depends on it). In pandas, every datetime value has a built-in dayofweek property. It returns the day of the week for each date, numbered from 0 (Monday) to 6 (Sunday). I don’t want to use numbers because they are not informative, so I convert them to words and change the order of the columns a bit.
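A minimal sketch of this step, on a made-up three-row table with assumed column names, could look like this:

```python
import pandas as pd

# Illustrative data; the column names are assumptions.
data = pd.DataFrame({
    "Date": pd.to_datetime(["2019-11-01", "2019-11-02", "2019-11-04"]),
    "Move Minutes count": [95, 40, 130],
})

# dayofweek numbers the days from 0 (Monday) to 6 (Sunday).
day_names = {
    0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thursday",
    4: "Friday", 5: "Saturday", 6: "Sunday",
}
data["Day of week"] = data["Date"].dt.dayofweek.map(day_names)

# Reorder the columns so that the new one sits next to the date.
data = data[["Date", "Day of week", "Move Minutes count"]]

print(data)
```

pandas also offers data["Date"].dt.day_name(), which returns the names directly; the explicit mapping is used here only because it follows the number-to-word conversion described in the text.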

After all these transformations, the dataset looks much better.

And now let’s start the most interesting part: understanding the data we have. For the sake of brevity and simplicity, I will investigate and visualize only part of it here. If you’re interested, the full version of this project is in the GitHub repository linked at the end of the text.

4.1. General activity

OK. The dataset is clean and ready for exploration. Let’s dive into it.

Let’s start with a general indicator of my daily activity: active minutes, that is, how many minutes I was physically active each day. Let’s draw a simple graph.
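A graph like this can be drawn with a few lines of Matplotlib. The sketch below uses a made-up week of values and an assumed column name, so the shape of the curve is not the one discussed in the text.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample; the column name is an assumption.
data = pd.DataFrame({
    "Date": pd.date_range("2019-11-01", periods=7, freq="D"),
    "Move Minutes count": [95, 40, 130, 110, 60, 150, 85],
})

# One line: active minutes over time.
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(data["Date"], data["Move Minutes count"])
ax.set_xlabel("Date")
ax.set_ylabel("Active minutes")
ax.set_title("Daily active minutes")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

In a Jupyter Notebook the figure is rendered at the end of the cell; in a plain script, add plt.show().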

Interesting. It looks like a heartbeat. The number of minutes I am active is not constant from day to day, nor does it increase day by day or month by month. It seems to be random.

OK. Even if there is no pattern here, I can at least answer one important question: am I doing enough to stay healthy? The World Health Organization recommends at least 150 minutes of moderate physical activity a week. Let’s see if my activity meets this threshold.

First, the sum for each week has to be calculated. The last week contains only one day, so I discard its value.

Now the data can be plotted. I add a horizontal line at the 150-minute mark to see whether my weekly activity exceeds it.
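One way to do the weekly aggregation and the plot is sketched here. It assumes weeks that end on Sunday (the default of resample("W")) and uses made-up values and an assumed column name.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample: 15 consecutive days starting on a Monday.
data = pd.DataFrame({
    "Date": pd.date_range("2019-11-04", periods=15, freq="D"),
    "Move Minutes count": [95, 40, 130, 110, 60, 150, 85,
                           100, 0, 120, 90, 75, 30, 105,
                           50],
})

# Sum the active minutes per calendar week (weeks end on Sunday).
weekly = data.set_index("Date")["Move Minutes count"].resample("W").sum()

# The last week holds a single day, so it is discarded.
weekly = weekly.iloc[:-1]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(weekly.index, weekly.values, marker="o", label="Weekly active minutes")
# Horizontal reference line at the recommended 150 minutes.
ax.axhline(150, color="red", linestyle="--", label="150 minutes")
ax.set_ylabel("Minutes per week")
ax.legend()
fig.autofmt_xdate()
```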

Looks good. At least I can say that I’m doing my best to stay healthy.

But how many minutes a day am I usually active? A histogram helps answer this question.

You can see that the distribution in this histogram is close to normal. The most common interval is 90-120 minutes. Not so bad. But beyond two and a half hours, the number of days drops sharply, and I have never been active for more than 3 hours.
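A histogram with 30-minute bins can be sketched as follows. The bin width is an assumption suggested by the 90-120 minute interval, and the values are made up.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up daily active minutes.
minutes = pd.Series([95, 40, 130, 110, 60, 150, 85,
                     100, 0, 120, 90, 75, 30, 105])

# Bin edges every 30 minutes: 0, 30, ..., 210.
bins = list(range(0, 211, 30))

fig, ax = plt.subplots(figsize=(8, 4))
# hist returns the number of days per bin, the edges, and the drawn bars.
counts, edges, _ = ax.hist(minutes, bins=bins, edgecolor="black")
ax.set_xlabel("Active minutes per day")
ax.set_ylabel("Number of days")

print(counts)
```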

And the last question I have about these general values: do they depend on the day of the week? It is for this question that I added the day-of-week column to the dataset.

First, I sum the active minutes for each day of the week. The average cannot be obtained by dividing by a fixed number, though, because each day of the week may occur a different number of times in the dataset. So I also count how many times each day of the week appears, and then divide the totals by these counts to get the average active minutes for each day of the week.

A chart shows these results visually.

It looks very natural. Like many others, I do a lot on Mondays and prefer to stay home and rest on Saturdays (note that Sunday is a working day where I live).
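The calculation described above can be sketched like this, again on made-up values with assumed column names:

```python
import pandas as pd

# Made-up sample: 15 days, so Monday occurs three times and other days twice.
data = pd.DataFrame({
    "Date": pd.date_range("2019-11-04", periods=15, freq="D"),
    "Move Minutes count": [95, 40, 130, 110, 60, 150, 85,
                           100, 0, 120, 90, 75, 30, 105,
                           50],
})
data["Day of week"] = data["Date"].dt.day_name()

grouped = data.groupby("Day of week")["Move Minutes count"]
total_minutes = grouped.sum()   # total active minutes per weekday
day_counts = grouped.count()    # how many times each weekday occurs

# Average per weekday = total / number of occurrences.
average_minutes = total_minutes / day_counts

# Put the days in calendar order instead of alphabetical order.
order = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]
average_minutes = average_minutes.reindex(order)

print(average_minutes)
```

The same numbers come out of grouped.mean(); the two-step version is spelled out only to mirror the reasoning in the text.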

4.2. Types of activity

And now it’s time to talk about the different kinds of activities I’ve done in these three months.

Impressive, huh? I did almost nothing but walk. Walking makes up over 75% of all my activity. Swimming comes second, then a little gymnastics, and almost invisible shares of running and Pilates.

OK, this is no reason to be upset. I can improve it later.
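Shares like these can be computed by summing the duration of each activity and dividing by the grand total. The sketch below assumes one duration column per activity; the column names and values are invented, so the percentages are not the real ones.

```python
import pandas as pd

# Hypothetical per-activity duration columns (minutes per day).
data = pd.DataFrame({
    "Walking duration (min)": [75, 60, 90],
    "Swimming duration (min)": [0, 45, 0],
    "Running duration (min)": [0, 0, 30],
})

# Total minutes per activity over the whole period.
totals = data.sum()

# Share of each activity in the overall active time, in percent.
share = totals / totals.sum() * 100

print(share.round(1))
```

A pie chart of these shares can then be drawn with share.plot.pie() if Matplotlib is installed.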

4.3. Walking

Since walking is my main activity, let’s talk about it in more detail.

You can see that my average walking time per day is 75 minutes. Over an hour. When we go to work or to the store every day, we don’t even think about how much time we spend moving from one place to another. But it matters, because it affects our whole life.

But what distance did I cover in all that time?

The minimum distance is 0. This is logical: of course, there were days when I stayed at home. And the maximum distance I walked in a day is a little more than 12 kilometers. That’s pretty good. Let’s see when that happened.

November 28th. Oh, I remember that day. I was on a trip with my family. It was a really cool day.

Another takeaway from this data is that I usually walk more than 4 km every day. Not that much, but not that little either, huh?

And the last question about distance: how many kilometers did I walk in total during those three months?

Wow! Over 400 kilometers!
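The distance questions in this section (minimum, maximum, typical value, the day of the longest walk, and the total) can all be answered with a few pandas calls. This is a sketch on synthetic values with assumed column names, not the real data.

```python
import pandas as pd

# Synthetic sample; distances are already in kilometers.
walking = pd.DataFrame({
    "Date": pd.date_range("2019-11-01", periods=4, freq="D"),
    "Distance (km)": [4.2, 0.0, 12.3, 3.5],
})

# describe() reports count, mean, min, max and quartiles in one go.
summary = walking["Distance (km)"].describe()

# idxmax() gives the row label of the longest walk; look up its date.
longest_day = walking.loc[walking["Distance (km)"].idxmax(), "Date"]

# Total distance over the whole period.
total_km = walking["Distance (km)"].sum()

print(summary)
print(longest_day.date(), total_km)
```

Keeping the unit in the column name helps here: a total summed over a column in meters is a thousand times larger than the same total in kilometers.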

Another metric related to distance is the number of steps. A widely cited guideline, often attributed to the World Health Organization, is to take at least 10,000 steps a day. Let’s see if I manage that.

Frankly, I don’t. I rarely reach the required number of steps. Let’s calculate the percentage of days that meet this condition.
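Counting the share of days that reach the target takes one comparison and one mean. The values and the column name below are made up for illustration.

```python
import pandas as pd

# Made-up daily step counts.
data = pd.DataFrame({"Step count": [4200, 11000, 8000, 10000]})

STEP_TARGET = 10_000

# The comparison yields True/False per day; the mean of booleans
# is the fraction of True values.
days_on_target = data["Step count"] >= STEP_TARGET
percent_on_target = days_on_target.mean() * 100

print(f"{percent_on_target:.1f}% of days reach {STEP_TARGET} steps")
```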

I don’t think it’s enough. I should work on it.

Now it is time to answer the questions asked at the start of the investigation and to confirm or reject the initial assumption. As you may remember, I thought that using Google’s fitness application on a daily basis would make me sportier.

Well, that was wrong. None of the graphs shows a steady increase over time. So simply using an app does not make you more active. But it can show what is really happening, and with this data you can decide what to do. It’s up to you, and that’s good news.

Now I know that walking is my strength. But I have to make a really big effort to push myself into regular training, and I shouldn’t rely on external stimuli. The motivation is inside. I need the courage to admit that and act on it.

So, I explored three months of data from the Google Fit application. I looked at what data I had, then cleaned, reorganized, described, and visualized it. I drew charts to identify relationships and patterns and to explain them.

With data, I can learn more about the phenomena that surround us, draw well-founded conclusions that shed light on what was previously unknown, and make informed decisions in any field of application. That’s why I like data and data analysis, and I would like to keep studying it. If you find something interesting, share it with people.

And here is a link to the GitHub repository that contains the full version of the project: https://github.com/shebeolga/Google-Fit-Data-Analysis. Thanks for your comments and additions to the topic.
