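A while ago, people in a technical discussion group I belong to were talking about a complete solution for reliably fetching Taobao product data: main image, price, title, and SKU information. The technical challenge caught my interest.
I have since load-tested my own implementation. Under sustained QPS the slider CAPTCHA is triggered only very rarely, the API is stable overall, and performance is good enough for the business scenario.
Several readers have also asked me about this in the comments and by private message.
So here on CSDN I will share the technical approach, with a demo in Java.
It has two parts, the HTTP request and the browser simulation: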
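import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

// The enclosing method signature is not shown in the original; doGet() is assumed here.
public static String doGet() {
    // Endpoint URL; contact the platform to obtain it
    String itemUrl = "{obtained from the platform}";
    // Build the GET request
    HttpURLConnection connection = null;
    InputStream is = null;
    BufferedReader br = null;
    String result = null; // response body as a string
    try {
        // Create the remote URL object
        URL url = new URL(itemUrl);
        // Open a connection and cast it to HttpURLConnection
        connection = (HttpURLConnection) url.openConnection();
        // Request method: GET
        connection.setRequestMethod("GET");
        // Connect timeout: 15,000 ms
        connection.setConnectTimeout(15000);
        // Read timeout: 60,000 ms
        connection.setReadTimeout(60000);
        // Send the request
        connection.connect();
        // Read the body only when the server answers 200
        if (connection.getResponseCode() == 200) {
            is = connection.getInputStream();
            // Wrap the input stream and specify the charset
            br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            // Accumulate the response line by line
            StringBuilder sbf = new StringBuilder();
            String temp;
            while ((temp = br.readLine()) != null) {
                sbf.append(temp);
                sbf.append("\r\n");
            }
            result = sbf.toString();
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        // Release resources
        if (null != br) {
            try {
                br.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (null != is) {
            try {
                is.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        // Close the remote connection; it is still null if new URL(...) or openConnection() threw
        if (null != connection) {
            connection.disconnect();
        }
    }
    return result;
}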
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.Cookie;
import org.openqa.selenium.WebDriver;

// webDriver and testUrl are created elsewhere (not shown in the original).
webDriver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
// 1. Open Taobao first, so the cookies below are set on the right domain
webDriver.get(testUrl);
// The cookie values below come from a logged-in session
webDriver.manage().addCookie(new Cookie("thw","cn"));
webDriver.manage().addCookie(new Cookie("_l_g_","Ug=="));
webDriver.manage().addCookie(new Cookie("lgc","\u6731\u5FD7\u677E88"));
webDriver.manage().addCookie(new Cookie("cookie1","UoNoTo/TdEXMCnhnlgHclN7PZN284TnOPEj92rBNYTE="));
webDriver.manage().addCookie(new Cookie("existShop","MTYyMTU4NzgzOQ=="));
webDriver.manage().addCookie(new Cookie("cookie2","14f49530bf8330d6b22eaf3acdc24251"));
webDriver.manage().addCookie(new Cookie("sg","837"));
webDriver.manage().addCookie(new Cookie("cna","e2UuGXr8AFICAXFFqm2pMDQL"));
webDriver.manage().addCookie(new Cookie("skt","694d43f333700ce1"));
webDriver.manage().addCookie(new Cookie("_tb_token_","e5d98e35b7ee3"));
webDriver.manage().addCookie(new Cookie("xlly_s","1"));
webDriver.manage().addCookie(new Cookie("dnk","\u6731\u5FD7\u677E88"));
webDriver.manage().addCookie(new Cookie("uc1","existShop=true&cookie14=Uoe2zEJWu0+7Iw==&pas=0&cookie16=WqG3DMC9UpAPBHGz5QBErFxlCA==&cookie15=VFC/uZ9ayeYq2g==&cookie21=VT5L2FSpdeCjwGS/FqZpWg=="));
webDriver.manage().addCookie(new Cookie("uc3","nk2=tacDJDHV1/c=&id2=UoYcAK2oRM6BeA==&lg2=U+GCWk/75gdr5Q==&vt3=F8dCuw++UgXqLx3riVk="));
webDriver.manage().addCookie(new Cookie("tracknick","\u6731\u5FD7\u677E88"));
webDriver.manage().addCookie(new Cookie("mt","ci=5_1"));
webDriver.manage().addCookie(new Cookie("uc4","id4=0@UO6VjxMTD4dlqn3KIVPnTkBcXgrQ&nk4=0@txMIDHit+SCJ5W5/1fajRpQ/cw=="));
webDriver.manage().addCookie(new Cookie("unb","1723573803"));
webDriver.manage().addCookie(new Cookie("tfstk","c7ccBn0NvusQjnwodINjkQ3aAYAdwIQzDcojafJQwn-Fa_f0ETWU6G9uqmcOC"));
webDriver.manage().addCookie(new Cookie("_samesite_flag_","true"));
webDriver.manage().addCookie(new Cookie("l","eBrffORnj6qP19W9BOfanurza77OSIRYYuPzaNbMiOCPOBfB5AkeX6skHXL6C3GVh6SDR3uh7KIMBeYBc7Vonxv9w8VMULkmn"));
webDriver.manage().addCookie(new Cookie("_cc_","V32FPkk/hw=="));
webDriver.manage().addCookie(new Cookie("cookie17","UoYcAK2oRM6BeA=="));
webDriver.manage().addCookie(new Cookie("_nk_","\u6731\u5FD7\u677E88"));
webDriver.manage().addCookie(new Cookie("sgcookie","E100zgp6%2FfkWrLApPdO9bSq5bShP0y6SrjiCUVn%2BGELKNlOjwwYSdcKaWxVHSu1XYUmxE%2BklKp86woHeFrq0qC65Tw%3D%3D"));
webDriver.manage().addCookie(new Cookie("t","f875ad8be099868f7620a96050fc4fb7"));
webDriver.manage().addCookie(new Cookie("csg","fd0548e9"));
webDriver.manage().addCookie(new Cookie("isg","BP7-BeR2Eg7z8kYqrwyo6sArTxJAP8K5FobJFagHaME8S5wlH8zXyIu5xReH6LrR"));
webDriver.get(testUrl);
Thread.sleep(2000);
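The long run of addCookie calls above could also be driven from a single cookie header string copied from the browser. The following is a minimal sketch of that idea, not the code used in this post: the class name CookieHeaderParser and the sample header are made up for illustration, and the parser assumes the usual "name=value; name=value" layout.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CookieHeaderParser {

    /**
     * Splits a raw Cookie header ("name1=value1; name2=value2") into
     * name/value pairs, preserving their order. Only the first '=' separates
     * the name from the value, so Base64 values ending in "==" stay intact.
     */
    public static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (header == null) {
            return cookies;
        }
        for (String pair : header.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed segments
            }
            cookies.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Hypothetical header string; real values come from your own session.
        Map<String, String> cookies = parse("thw=cn; _l_g_=Ug==; sg=837");
        for (Map.Entry<String, String> e : cookies.entrySet()) {
            // With Selenium, each entry would become:
            // webDriver.manage().addCookie(new Cookie(e.getKey(), e.getValue()));
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Keeping the cookies in one string makes it easier to swap in a fresh session when the old one expires, since only that string has to change.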
The returned fields are shown below; I have annotated the fields in the response JSON with comments.
{
  "msg":"获取成功", //status message ("fetched successfully")
  "code":0,
  "data":{
    "productId":"65272193xxx", //product ID
    "shopLogo":null,
    "productImg":"xxxxx", //product main image
    "shopName":"xxxxx", //shop the product belongs to
    "rootCategoryId":null,
    "productTitle":"", //product title
    "defPrice":"949", //product price
    "sellerId":"890482188", //seller ID
    "brandId":null,
    "shopId":"71955116", //shop ID
    "shopType":"B",
    "shopWw":"xxxxx",
    "categoryId":null
  },
  "state":true
}
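To show how the fields above might be read on the client side, here is a small sketch using only the JDK. It is an illustration under assumptions, not part of the original demo: the class ResponseFields and the method stringField are hypothetical names, the sample string is a shortened version of the response without the explanatory comments, and the regular expression only handles simple string values without escaped quotes. A real project would more likely use a JSON library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResponseFields {

    /**
     * Returns the string value stored under the given key, or null when the
     * key is absent or its value is JSON null. Handles only simple string
     * values that contain no escaped quotes.
     */
    public static String stringField(String json, String key) {
        Pattern p = Pattern.compile(
                "\"" + Pattern.quote(key) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher m = p.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened sample of the response shown above (comments removed).
        String json = "{\"code\":0,\"data\":{"
                + "\"productId\":\"65272193xxx\",\"shopLogo\":null,"
                + "\"defPrice\":\"949\",\"sellerId\":\"890482188\","
                + "\"shopId\":\"71955116\",\"shopType\":\"B\"},\"state\":true}";

        System.out.println("productId = " + stringField(json, "productId"));
        System.out.println("defPrice  = " + stringField(json, "defPrice"));
        System.out.println("shopLogo  = " + stringField(json, "shopLogo"));
    }
}
```

Note that defPrice arrives as a string, so it needs to be converted (for example with new BigDecimal(...)) before any arithmetic.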
If you have questions or are interested in web scraping, feel free to join the discussion in the comments.