Data analysis final

Page 1: Data analysis final

1 Team 05

Data Analysis Team 05

Data Analysis 1 - The Prediction of Data -

20/10/2012

Nobuya Yoshizawa, Goshi Fujimoto, Atsuko Chiba, Xu Changjing

Page 2: Data analysis final

Outline

1. Objectives
2. Hypothesis
3. Analysis process
4. Result
5. Conclusion
6. Possible reasons
7. Role of members

Q&A

Page 3: Data analysis final

1. Objectives

Does investment in the future lead to high management performance?

What is Experimental and research expense? It is the special expense for studying and researching new products or new technology.

⇒ Future investment!

[Figure: bar chart of Experimental and Research Expense (K Yen/Firm), vertical axis from 0 to 20,000. Bar values: 40,581; 11,497; 11,203; 8,414; 3,319; 2,369; 2,130; 1,991; 1,628; 1,358; 1,199; 1,147; 1,028; 829; 434; 384.]

Page 4: Data analysis final

2. Hypothesis

Large manufacturing companies in Japan have many employees and own laboratories to develop new products and technology.

New products may be expensive in the short term, but they can make a company profitable.

Since such a company produces new products and technology, its total asset should be high.

When Experimental and research (E&R) expense is high:

• # of employees is high
• Gross profit rate is high
• Total asset is high

Page 5: Data analysis final

2. Hypothesis

[Figure: scatter plots of the hypothesis variables against E&R expense]

The scatter plots show a clear relationship between E&R expense and each hypothesis variable. Next, we build the multiple regression model.
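Written as an equation, the three hypothesis variables give a model of roughly the following form. The symbols \beta_i and \varepsilon are notation introduced here and do not appear on the slides. The choice of variables and the log transforms follow the simple model in the appendix; the base of the logarithm on the right-hand side is not stated in the slides.

```latex
\log_{10}(\text{E\&R expense})
  = \beta_0
  + \beta_1 \log(\text{Number of employees})
  + \beta_2 \log(\text{Total asset})
  + \beta_3 \,(\text{Gross profit rate})
  + \varepsilon
```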

Page 6: Data analysis final

3. Analysis

Goal: understand the objective data in depth and find its correlation with the other data.

1. Overviewing the objective data

2. Making the correlation matrix

3. Picking up explanatory variables

4. Developing the multiple regression model

5. Improving the multiple regression model

Page 7: Data analysis final

3-1. Overviewing the objective data

Overview of E&R expense:

1. About half of the firms have no E&R investment.

2. The other half show a wide range of E&R investment.

Page 8: Data analysis final

3-1. Overviewing the objective data

1. We are interested only in companies that have experimental and research expense, so we took 815 of the 2090 companies as the objective data.

2. We converted E&R expense to log10(E&R expense) as the objective variable, to compress its wide numerical range.

815 companies (E&R expense > 0)

1275 companies (E&R expense = 0)
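The two preparation steps above can be sketched in a few lines of Python. This is only an illustration, not the team's code: the function name `prepare_objective` and the sample values are hypothetical.

```python
import math

def prepare_objective(er_expenses):
    """Keep firms with positive E&R expense and return log10 of each value."""
    positive = [x for x in er_expenses if x > 0]
    return [math.log10(x) for x in positive]

# Hypothetical E&R expenses in K Yen; a zero means no E&R investment.
sample = [0, 384, 0, 11497, 40581, 0]
log_er = prepare_objective(sample)
print(len(log_er), [round(v, 3) for v in log_er])
```

Dropping the zeros first matters because log10(0) is undefined, which is one practical reason to model only the firms with positive expense.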

Page 9: Data analysis final

3-2. Making the correlation matrix

                   TotalAsset  logTotalAsset  CurrentAsset  LongTermAsset  LongTermLiability  logE&Rexpense  …
TotalAsset         1           0.589          0.603         0.960          0.936              0.426          …
logTotalAsset      0.589       1              0.637         0.466          0.428              0.777          …
CurrentAsset       0.603       0.637          1             0.354          0.311              0.529          …
LongTermAsset      0.960       0.466          0.354         1              0.987              0.313          …
LongTermLiability  0.937       0.428          0.311         0.987          1                  0.279          …
logE&Rexpense      0.426       0.777          0.529         0.313          0.279              1              …
…                  …           …              …             …              …                  …              …

Purpose:
• To find the explanatory variables that have a strong relationship with E&R expense.
• To group similar explanatory variables so that the model avoids multicollinearity.
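A minimal sketch of both purposes, assuming Pearson correlation (the slides do not name the coefficient used). The data, the helper names, and the 0.9 cutoff for "similar" variables are all hypothetical choices made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

def collinear_pairs(columns, threshold=0.9):
    """List distinct variable pairs whose |r| is at or above the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs

# Hypothetical toy data, not the team's dataset.
data = {
    "TotalAsset": [10.0, 20.0, 30.0, 40.0, 50.0],
    "LongTermAsset": [6.0, 11.0, 19.0, 24.0, 31.0],
    "GrossProfitRate": [30.0, 10.0, 25.0, 15.0, 20.0],
}
for a, b, r in collinear_pairs(data):
    print(f"{a} ~ {b}: r = {r:.3f}")
```

Pairs flagged this way, such as the 0.960 and 0.987 entries in the matrix above, are candidates to be represented by a single variable in the regression.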

Page 10: Data analysis final

3-3. Picking up explanatory variables

Top variables with a strong relationship to E&R expense (correlation with log E&R expense):

Log Total Asset                       0.777
Log Current Asset                     0.766
Log Note And Account Payable          0.706
Log Depreciation                      0.760
Log Number of Employee                0.756
Log Sales Income                      0.748
Log Personnel Expense                 0.741
Log Aggregate Value of Listed Stock   0.787
Log BreakEvenPoint                    0.697

Page 11: Data analysis final

3-4. Developing the multiple regression model

Based on the hypothesis and a statistical approach, we developed the multiple regression model.

The hypothesis is the most important part, because the model must be easy to explain and acceptable to the audience.

Then we tried to find the optimal explanatory variables without decreasing the t-values and R^2.

[Diagram: the objective variable is explained by hypothesis variables (A variable, B variable, C variable, D variable) and by statistically selected variables (E variable, F variable, ...).]
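The fitting step can be sketched as ordinary least squares solved through the normal equations. The slides' appendix shows output from R's `lm`; the pure-Python version below is a stand-in written for illustration, with hypothetical data and helper names, and it reports only coefficients and R^2.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(xs, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R^2)."""
    rows = [[1.0] + list(r) for r in xs]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return coef, 1.0 - ss_res / ss_tot

# Hypothetical data: two candidate explanatory variables and one objective.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 5.0, 3.0, 8.0, 4.0]
y = [2.1, 3.9, 7.2, 8.1, 12.5, 12.0]
_, r2_one = fit_ols([[a] for a in x1], y)
_, r2_two = fit_ols([[a, b] for a, b in zip(x1, x2)], y)
print(f"R^2 with x1 only: {r2_one:.3f}; with x1 and x2: {r2_two:.3f}")
```

R^2 cannot decrease when a variable is added to a least-squares model, so on its own it cannot tell whether a new variable is useful. That is why it makes sense to watch the t-values as well, as the slide describes.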

Page 12: Data analysis final

3-5. Improving the multiple regression model

An example of improvement

We found the following relationships:
• Total asset: high negative correlation
• Current asset: high positive correlation

We then replaced total asset with the current asset ratio to total asset (= Current asset / Total asset), which has a very high positive correlation.

The current asset ratio is more important than total asset in explaining E&R expense because:
• E&R expense is counted as a deferred current asset
• Companies with E&R expense are more active than those with no E&R expense

Page 13: Data analysis final

4. Result

Variable                                     Normalized coefficient   P-value
Gross profit rate                            0.258                    P<0.001
Current asset ratio to total asset           0.106                    P<0.001
Log Number of employee                       0.090                    P=0.052
Log Inventory product                        0.076                    P<0.001
Percentage of export                         0.088                    P<0.001
Average salary                               0.188                    P<0.001
Consolidated income ratio to single income   0.092                    P<0.001
Investment security                          0.073                    P<0.01
Personnel expense (per capita)               -0.139                   P<0.001
Log Note and account receivable              0.111                    P<0.01
Log Depreciation                             0.489                    P<0.001

Page 14: Data analysis final

5. Conclusion I

Common characteristics:
• High profit rate, total asset, and cash flow
• High investment in experimental installations
• High number of employees and high salary
• Large global companies

R^2 = 0.750 (improved model)

R^2 = 0.672 (model based on hypothesis)

The improved model is strongly fitted and has smaller residuals.

Page 15: Data analysis final

5. Conclusion II

As a result, we verified the three hypothesis variables and one additional variable obtained by improving the multiple regression model (refer to Slide 11).

When the experimental and research expense is high:

Variable               Correlation   Result
Gross profit rate      Verified      The gross profit rate is correlated
Total asset            Verified      The total asset is correlated
# of employee          Verified      The # of employee is correlated
Current asset ratio    Verified      The current asset ratio is correlated

Page 16: Data analysis final

6. Possible reasons

IT bubble era in 1996
- NEC and Fujitsu incurred experimental and research expenses in 1996.
- During the IT bubble era, IT companies invested in market research and advanced technology to differentiate themselves from their domestic and foreign competitors.

Japanese manufacturing style
- Large companies, such as electricity, gas, or exporting firms, could afford to have laboratories and to spend on experimental and research expense.

Page 17: Data analysis final

7. Role of members

Name                         Role
Fujimoto Goshi (Leader)      Facilitator; analyzing data
Xu Changjing (Co-leader)     Analyzing data
Chiba Atsuko                 Analyzing data
Yoshizawa Nobuya             Preparing presentation slides

Page 18: Data analysis final

Thank you for your attention.

Q&A

Page 19: Data analysis final

Appendix – simple model

lm(formula = logExperimentalAndResearchExpense ~ logNumberOfEmployee + logTotalAsset + GrossProfitRate)

Residuals:
     Min       1Q   Median       3Q      Max
-2.02654 -0.27310  0.09164  0.37059  1.13517

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -2.674841   0.158979 -16.825  < 2e-16 ***
logNumberOfEmployee  0.544842   0.082013   6.643 5.62e-11 ***
logTotalAsset        0.708790   0.070862  10.002  < 2e-16 ***
GrossProfitRate      0.014168   0.001247  11.365  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4997 on 811 degrees of freedom
Multiple R-squared: 0.6715, Adjusted R-squared: 0.6703
F-statistic: 552.5 on 3 and 811 DF, p-value: < 2.2e-16

Page 20: Data analysis final

Appendix – improved model

Residuals:
     Min       1Q   Median       3Q      Max
-2.02046 -0.23227  0.07728  0.29399  1.24620

Coefficients:
                                        Estimate Std. Error t value Pr(>|t|)
(Intercept)                           -2.301e+00  1.676e-01 -13.731  < 2e-16 ***
logNumberOfEmployee                    1.553e-01  7.990e-02   1.944 0.052296 .
logNoteAndAccountReceivabe             1.676e-01  5.764e-02   2.908 0.003743 **
logInventoryProduct                    6.072e-02  1.624e-02   3.739 0.000198 ***
logDeprecoation                        6.306e-01  6.513e-02   9.682  < 2e-16 ***
GrossProfitRate                        1.588e-02  1.145e-03  13.873  < 2e-16 ***
PerCapitaPersonnelExpenseKYen         -7.497e-05  1.165e-05  -6.437 2.09e-10 ***
RatioTotalCurrentAsset                 6.073e-01  1.535e-01   3.957 8.25e-05 ***
PercentageOfExport                     4.642e-03  9.954e-04   4.663 3.65e-06 ***
ConsolidatedIncomeToSingleIncomeRatio  1.598e-01  3.371e-02   4.740 2.53e-06 ***
AverageSalary                          3.255e-06  3.971e-07   8.197 9.72e-16 ***
InvestmentSecurity                     3.755e-06  1.172e-06   3.204 0.001409 **

Residual standard error: 0.4382 on 803 degrees of freedom
Multiple R-squared: 0.7499, Adjusted R-squared: 0.7465
F-statistic: 218.9 on 11 and 803 DF, p-value: < 2.2e-16
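The adjusted R-squared and F-statistic in both appendix outputs can be recomputed from the reported R-squared, the sample size, and the number of explanatory variables using the standard formulas. The sketch below takes n = 815 from Slide 8 and k = 3 and k = 11 from the two coefficient tables; the function names are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F-statistic on k and n - k - 1 degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

n = 815  # firms with E&R expense > 0
for name, r2, k in [("simple", 0.6715, 3), ("improved", 0.7499, 11)]:
    print(name, round(adjusted_r2(r2, n, k), 4), round(f_statistic(r2, n, k), 1))
```

The recomputed values agree with the R output up to rounding of the reported R-squared. Since adjusted R-squared penalizes the number of variables, its rise from 0.6703 to 0.7465 indicates that the eight extra variables improve the fit beyond what added parameters alone would give.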