Data Science: a novice approach
1. Introduction:
Data science is still a hot topic today, in 2018. I have read many articles on the subject, and I know many people are excited to get into data science, but it is such a huge domain that an explorer can get lost in a dark forest of knowledge. I used to study data science day by day at different levels (from zero to who knows where). My method was the same as everybody's: search for keywords, read some books, try to practice somewhere. But all I got back was a chaos of objects. My sense of data science was like an illusion. I left it for about a year and have now come back to it. So this time, I will try to illustrate a way into data science with a short description. I write this down to teach myself, to clear my mind, and to share.
What is the purpose of data science? I remember one thing: "get insight from messy data". I do many tasks in the data space to extract something that shows the values I am after. The final goal is the meaning of the data. A data machine has a structure that takes messy data as input and gives valuable data as output. Data science covers the whole process of producing a good data output. A simple image that I took from Google, "Data Science Overview", shows how complicated data science is.
2. Principle:
Everything revolves around data. So what is data? I will give the simplest definition from my point of view: data is a collection of information that can be organized or unorganized. In the information age, data keeps getting bigger. I want to get value back from big data. I need tools, methods, and strategies to do that job. I work on the data the way I would do science: data science has to observe and experiment on the data object.
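To make the definition above concrete for myself, here is a minimal Python sketch of the difference between organized and unorganized data. The records and the sentence are invented samples, not real data, and the regular expression is only one naive way to pull numbers out of text.

```python
import re

# Organized data: every record has the same named fields (invented sample).
organized = [
    {"name": "An", "age": 31, "city": "Hanoi"},
    {"name": "Binh", "age": 27, "city": "Hue"},
]

# Unorganized data: free text with no fixed fields (invented sample).
unorganized = "An is 31 and lives in Hanoi. Binh, 27, is from Hue."

# With organized data, a question is a direct lookup on a known field.
average_age = sum(row["age"] for row in organized) / len(organized)

# With unorganized data, the same question first needs an extraction step.
ages_in_text = [int(n) for n in re.findall(r"\d+", unorganized)]

print(average_age)
print(ages_in_text)
```

The point is only that the same information can sit in both forms, and the unorganized form needs extra work before I can ask it anything.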
I must choose a suitable scope, because data science is multidisciplinary.
3. A novice approach:
I kept sections 1 and 2 as short as I could. What I want to approach, as a novice in data science, is observation and experiment on the data object.
I imagine that my tasks run inside a machine. This machine takes input data and produces output data. Here is an overview of its structure:
- Input data: organized or unorganized data.
- Process: transforms the input data into output data.
- Output data: data with the expected value.
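The machine above can be sketched in a few lines of Python. This is only a toy under my own assumptions: the function name `process`, the sample input, and the goal (keep whatever can be read as a number) are all made up for illustration.

```python
def process(input_data):
    """Transform messy input into the expected output (here: clean numbers)."""
    output = []
    for item in input_data:
        text = str(item).strip()
        try:
            output.append(float(text))
        except ValueError:
            continue  # skip values that cannot be read as a number
    return output

# Input data: a messy mix of strings and numbers (invented sample).
input_data = [" 3", "4.5", "n/a", 7, ""]

# Output data: only the values that match the expected form.
output_data = process(input_data)
print(output_data)
```

A real process would be much more than one function, but the shape stays the same: input goes in, a transformation runs, output comes out.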
I have the data object. I select the goal. I dig into the input data and search for its essence. I manipulate the data and mix it through the process. I apply the scenario and loop until I get acceptable results. I have created two subsections below, given that I know the end target.
3.1. First, data observation:
I pick up my data object and look at it. What kind of data is it? Do its properties match my design? How long does the process take to run on this data? Does the output data meet the goal? If not, how should it change to fit? I should come out of this with a good solution.
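Some of the questions above can be asked in code. The sketch below is a minimal one, assuming Python and an invented sample; `observe`, `timed`, and `keep_numbers` are hypothetical helper names of my own, not part of any library.

```python
import time
from collections import Counter

def observe(data):
    """What kind of data is it? Count the values by type name."""
    return Counter(type(item).__name__ for item in data)

def timed(process, data):
    """How long does the process take? Return its output and elapsed seconds."""
    start = time.perf_counter()
    output = process(data)
    elapsed = time.perf_counter() - start
    return output, elapsed

def keep_numbers(items):
    """A toy process: keep only the numeric values."""
    return [x for x in items if isinstance(x, (int, float))]

data = [3, "4.5", None, 7.0, "n/a"]  # invented sample
kinds = observe(data)
output, seconds = timed(keep_numbers, data)

# Does the output meet the goal? Here the goal is "numbers only".
meets_goal = all(isinstance(x, (int, float)) for x in output)

print(kinds)
print(output, meets_goal)
```

If `meets_goal` came back false, that would be my signal to go back and change either the data or the process.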
3.2. Second, data experiment:
I pick up the solution and carry it out. What kind of system is it? Do its characteristics support the solution? How long does the system take to execute this solution? Is the solution effective? How many times does the system loop to fit? I do this experiment in the real world.
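The experiment can also be sketched as a loop that counts its own iterations and measures its own time. This is a toy under my own assumptions: `experiment` is a hypothetical helper, and the scenario (halving an error value until it drops under a tolerance) stands in for whatever a real solution would do.

```python
import time

def experiment(step, state, is_acceptable, max_loops=100):
    """Apply `step` until `is_acceptable(state)` is true or max_loops is reached."""
    start = time.perf_counter()
    loops = 0
    while not is_acceptable(state) and loops < max_loops:
        state = step(state)
        loops += 1
    elapsed = time.perf_counter() - start
    return state, loops, elapsed

# Toy scenario: each loop halves the error; stop once it is under 0.01.
result, loops, seconds = experiment(
    step=lambda err: err / 2,
    state=1.0,
    is_acceptable=lambda err: err < 0.01,
)

print(result, loops)
```

The `max_loops` limit is there so that a solution which never fits still stops, which is itself an answer to the question of whether the solution is effective.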