Before performing regression analysis, you first need to clean the captured data (because regression analysis can only process numerical data, and cannot process non-numerical data such as text/empty data/garbled characters). In the “cleaned product portrait data” Excel table, data cleaning is mainly for the “rating quantity” data (because some of the “rating quantity” data generated an incorrect “S” symbol when captured). Click the “Sort and Filter” drop-down button in Excel, select the “Customize Auto Filter Method” option in the pop-up drop-down list, select the “Contains” option in the pop-up “Customize Auto Filter Method” dialog box, and enter the “S” symbol.
After completing the above steps, you can see all invalid data containing the “S” symbol in the “Rating Quantity” column.
After filtering out all invalid data, select all filtered data, then right-click and select the “Delete Row” command in the pop-up shortcut menu.
After deleting the invalid data, because the purpose of regression analysis is to explore the impact of different variables on the sales ranking of goods, the impact of differentiated pricing represented by “tiered pricing” on the sales ranking of goods is also within the scope of consideration. Therefore, it is necessary to add a new column of data to the original table, named “Goods Price Difference”, and its value is “Goods Price Range (Highest Price)” minus “Goods Price Range (Lowest Price)”.
Then create a new worksheet “Sheet2” in the Excel table, and copy the data of “Clothing, Shoes & Jewelry”, “Rating”, “Number of Ratings”, “Exposure Price”, and “Goods Price Difference” in the original table.
In the new worksheet, take the logarithmic function values of the data of the five dimensions of “Clothing, Shoes & Jewelry”, “Rating”, “Number of Ratings”, “Exposure Price”, and “Goods Price Difference”. Taking the LNO function of variables on both sides of the equation is a common mathematical technique in the field of marketing. In the field of Amazon cross-border e-commerce, because operators cannot directly obtain Amazon’s real sales data (only ranking data), the logarithmic function can be used to convert the original “sales change” into “ranking percentage change” (the relationship between the percentage change of the independent variable and the percentage change of the dependent variable).
In the “InBSR” column, in order to make the data display more accurate, select the column data in Excel, then right-click, and select the “Format Cells” command in the pop-up shortcut menu.
In the pop-up “Format Cells” dialog box, select the “Value” option, and then set the “Decimal Places” to “2”.
After completing the settings, save all the dimensional data of the logarithmic function to the new worksheet “Sheet3”, and save it by value.
Because there is no regression analysis function in the basic toolbar of the Excel 2016 version, you need to manually add the relevant function. The operation method is: click “Start” in Excel, and then select the “More” option on the selection page.
Click “Add-ins” in the pop-up “Excel Options” dialog box.
In “Add-ins”, click the “Go” button.
Check the “Analysis ToolPak” checkbox in the pop-up dialog box, and then click the “OK” button.
After completing the above operations, the “Data Analysis” function option will appear in the “Data” function in the operation bar above Excel.
Click “Data Analysis”, select the “Regression” option in the pop-up “Data Analysis” dialog box, and then click the “OK” button.
Set the variables in the pop-up dialog box, where the “Y value input area” corresponds to the dependent variable, that is, “InBSR”, and the “X value input area” corresponds to the independent variable, that is, “ln score”, “In score number”, “In exposure price”, and “In price range difference”. It should be noted that when selecting the dependent variable data and the independent variable data, you need to start from the “variable name”, that is, the first row, and then check the “flag” checkbox in the “regression” dialog box.
In the regression operation interface, select “Y value input area” and “X value input area” in turn, and then check the “Sign” checkbox after selecting the first row of variable names.
After completing all operations, the numerical results of the regression analysis will be generated in the new worksheet “Sheet4”.