Manual collection, as the name implies, means the operator gathers data from the Amazon platform by the basic “copy and paste” method. It is generally used for data on the Amazon search exposure page and the product detail page. Its advantages are that it has no technical threshold and is flexible and convenient; its disadvantage is low efficiency. Generally speaking, manually collecting a single data point takes about 5 seconds (the time to copy the value from the Amazon platform and paste it into a table or database). An operator who works 8 hours a day can devote 0.5 to 1 hour of effective time to data collection. Taking the midpoint of 45 minutes, the effective daily volume is 45 × 60 ÷ 5 = 540 data points. If each product requires data in three dimensions, such as number of reviews, ranking, and price, then 540 ÷ 3 = 180; that is, one operator can effectively track and collect data for 180 products per day.
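The arithmetic above can be sketched as a small calculation. This is only an illustrative sketch: the function name `daily_capacity` and its parameter names are hypothetical, and the defaults are the figures assumed in the text (45 minutes per day, 5 seconds per data point, three dimensions per product).

```python
def daily_capacity(minutes_per_day=45, seconds_per_point=5, dimensions=3):
    """Estimate how much one operator can collect by hand in a day.

    Returns a tuple (data_points_per_day, products_per_day).
    All defaults are the assumptions stated in the text.
    """
    # Total data points that fit into the effective collection time.
    points = int(minutes_per_day * 60 // seconds_per_point)
    # Each product consumes one data point per tracked dimension.
    products = points // dimensions
    return points, products


if __name__ == "__main__":
    points, products = daily_capacity()
    print(f"{points} data points per day, {products} products per day")
```

Changing the arguments shows how sensitive the result is to the assumptions: at the upper bound of 1 hour per day, for instance, `daily_capacity(minutes_per_day=60)` gives 720 data points and 240 products.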
If the operator wants to improve the efficiency of manual collection, the author has the following two suggestions.
1. Clarify the purpose of data collection, so as to reduce how much and how often data must be collected. For example, if the operator wants to understand the sales distribution of different products under a search keyword, it can be estimated by manually collecting only the sales ranking of the first 500 to 1,000 products on the search exposure page. No other dimensions need to be collected, and there is no need to collect the data every day, because the sales distribution under a category or keyword does not change over a short period of time. Both reductions improve collection efficiency.
2. When comprehensive data collection is difficult to achieve, the operator can use sampling to improve collection efficiency.
For example, suppose the operator wants to analyze the distribution of reviews, rankings, and prices for the products on the first 100 pages under a certain search keyword, but has neither the ability nor the funds to develop a crawler program, and has not found a suitable tool or third-party collector. In this case, sampling can be used. The operator can treat each of the first 100 pages as one group, giving 100 groups, and, assuming there are 48 products on each page, take the 8th, 16th, 24th, 32nd, 40th, and 48th products from each group. Each group then requires only 6 collections, for a total of 6 × 100 = 600 collections. Since each collection involves three dimensions (reviews, ranking, and price), the total amount of data collected is 600 × 3 = 1,800 data points. If one data point takes 5 seconds to collect, the total collection time is 1,800 × 5 = 9,000 seconds, or about 2.5 hours. At roughly 45 minutes of effective collection time per day, this is about 4 working days, so all data collection can be completed within 1 week.
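The sampling plan described above can be checked with a short sketch. The names `sample_positions` and `sampling_plan` are hypothetical, and every default value is an assumption taken from the example (100 pages, 48 products per page, every 8th product, three dimensions, 5 seconds per data point, 45 minutes of collection time per day).

```python
import math


def sample_positions(per_page=48, step=8):
    """Positions on a page to sample: every `step`-th product (1-based)."""
    return list(range(step, per_page + 1, step))


def sampling_plan(pages=100, per_page=48, step=8, dimensions=3,
                  seconds_per_point=5, minutes_per_day=45):
    """Estimate the workload of systematic sampling across search pages."""
    positions = sample_positions(per_page, step)
    collections = len(positions) * pages       # products visited in total
    points = collections * dimensions          # individual values copied
    hours = points * seconds_per_point / 3600  # total copy-and-paste time
    # Working days needed at the assumed daily collection time, rounded up.
    days = math.ceil(hours * 60 / minutes_per_day)
    return {
        "positions": positions,
        "collections": collections,
        "points": points,
        "hours": hours,
        "days": days,
    }


if __name__ == "__main__":
    plan = sampling_plan()
    for key, value in plan.items():
        print(f"{key}: {value}")
```

Sampling at a fixed interval keeps the plan easy to follow by hand. A larger `step` reduces the workload proportionally, at the cost of a coarser picture of each page.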