Notes on the Amazon Product Advertising API

Notes on Amazon's Product Advertising API. These are personal reminders, so they are very rough.

Overview of the API
Parameters passed to the API
- List of categories
Python library
Reference information
- Response Groups
- XML attributes for products
References

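Overview of the API

An API that returns information from Amazon's product database as XML.

The available data is broad, ranging from product details to stock information.

There are operations on the product database and operations on the shopping cart.

This article covers only the product database operations.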
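Since everything comes back as XML, here is a minimal sketch of pulling fields out of a response with only the standard library. The XML below is a hand-written, simplified stand-in, not real API output, and the namespace URI and the helper names `local_name` and `parse_items` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a response; not real API output.
# The namespace URI is a placeholder chosen for this example.
SAMPLE_XML = """<ItemSearchResponse xmlns="http://example.com/hypothetical-namespace">
  <Items>
    <Item>
      <ASIN>B000000001</ASIN>
      <ItemAttributes>
        <Title>Sample Book</Title>
        <Author>Author One</Author>
        <Author>Author Two</Author>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_items(xml_text):
    """Return one dict per <Item> with its ASIN, title, and all authors."""
    root = ET.fromstring(xml_text)
    items = []
    for elem in root.iter():
        if local_name(elem.tag) != "Item":
            continue
        record = {"asin": None, "title": None, "authors": []}
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "ASIN":
                record["asin"] = child.text
            elif name == "Title":
                record["title"] = child.text
            elif name == "Author":
                # An item can carry several Author elements, so collect them all.
                record["authors"].append(child.text)
        items.append(record)
    return items


print(parse_items(SAMPLE_XML))
```

Unlike an HTML-oriented parser, ElementTree keeps the original tag case, which is why the lookups use `ASIN` and `Title` as written.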

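Search
- ItemSearch: search for Items (product information) by keyword (or by an ID such as an ASIN)
Lookup (look up a specific entry)
- BrowseNodeLookup: get information about a browse node (a category, seller, or manufacturer)
- ItemLookup: look up an Item by an ID such as an ASIN or ISBN
- SimilarityLookup: find similar products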
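The operations above differ mainly in the `Operation` parameter and in which identifier they take. The following is a rough sketch of the unsigned query parameters for each one, using only the standard library. The helper `build_params` and the `OPERATION_PARAMS` table are hypothetical, the parameter set per operation reflects my reading of the docs rather than a complete list, the keys are dummies, and request signing (which a client library normally handles) is left out.

```python
from urllib.parse import urlencode

# Identifier parameters each operation takes. This is a partial list based on
# my reading of the docs, so treat it as an assumption.
OPERATION_PARAMS = {
    "ItemSearch": ("Keywords", "SearchIndex"),
    "ItemLookup": ("ItemId", "IdType"),
    "SimilarityLookup": ("ItemId",),
    "BrowseNodeLookup": ("BrowseNodeId",),
}


def build_params(operation, access_key, associate_tag, **kwargs):
    """Build the unsigned query parameters for one operation (hypothetical helper)."""
    if operation not in OPERATION_PARAMS:
        raise ValueError(f"unsupported operation: {operation}")
    params = {
        "Service": "AWSECommerceService",
        "Operation": operation,
        "AWSAccessKeyId": access_key,
        "AssociateTag": associate_tag,
    }
    allowed = OPERATION_PARAMS[operation] + ("ResponseGroup",)
    for key, value in kwargs.items():
        if key not in allowed:
            raise ValueError(f"{key} is not expected for {operation}")
        params[key] = value
    return params


# Dummy credentials and a dummy ASIN, just to show the resulting query string.
params = build_params("ItemLookup", "DUMMY_KEY", "dummy-22",
                      ItemId="B000000001", IdType="ASIN")
print(urlencode(sorted(params.items())))
```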

Official documentation

Welcome - Product Advertising API

The latest version is dated August 1, 2013.

Follow the documentation on docs.aws.amazon.com, not the one on Amazon.co.jp.

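How to start using it

First, register for Amazon Associates
Enable the API and obtain an API key
Create an Amazon Associates account (it is safer to keep it separate from your shopping account)
Get the access keys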

Amazon Product Advertising API link collection
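Once the keys are issued, a script has to pick them up without hard-coding them. Here is a minimal sketch, assuming they are exported as environment variables under the names used by the script later in this memo. The helper `load_credentials` is hypothetical; it fails loudly if a variable is missing instead of silently passing `None` to the client.

```python
import os

# Variable names follow the script in this memo; adjust to your own setup.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_ASSOCIATE_TAG")


def load_credentials(environ=os.environ):
    """Return the three credentials, raising if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


# Demo with a stand-in mapping instead of the real environment.
fake_env = {
    "AWS_ACCESS_KEY_ID": "DUMMY_ACCESS_KEY",
    "AWS_SECRET_ACCESS_KEY": "DUMMY_SECRET",
    "AWS_ASSOCIATE_TAG": "dummy-22",
}
print(sorted(load_credentials(fake_env)))
```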

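Caveats

Amazon has become very strict about use outside the intended purpose, so take care.

Usage guidelines for the Product Advertising API (PA-API)

As long as at least one sale per month comes through an unmodified URL taken from an API response, at least one call per second is allowed.

Having to use the URLs from the API response without any modification seems quite strict to me.

I suppose this is fallout from the price-research databases built for resellers.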
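Given the one-call-per-second figure above, it seems sensible to space out requests on the client side. The following is a minimal sketch of a throttle that does this. The `Throttle` class is hypothetical, and the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time


class Throttle:
    """Block so that consecutive calls are at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=1.0)
for keyword in ["first", "second"]:
    throttle.wait()  # the API call would go right after this line
    print("would request:", keyword)
```

Calling `wait()` before every request keeps the script under the limit even when the work between requests is fast.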

Parameters passed to the API

Common to all countries

List of categories

Needed when using SearchIndex.

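Python library

There is a library called bottlenose, so I use that.

Below is something I wrote to check how the API behaves: a sample that fetches keyword search results.

The raw API response is saved as XML. The second half parses the result into a Python list. The API keys come from environment variables via dotenv.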

#! /usr/bin/env python3
# coding: utf-8

# Standard library
import os
import sys
import pathlib
from datetime import datetime
from time import sleep

# Third-party packages
import bottlenose
from bs4 import BeautifulSoup
from dotenv import load_dotenv

# [GitHub - theskumar/python-dotenv: Get and set values in your .env file in local and production servers.](https://github.com/theskumar/python-dotenv)
# [GitHub - lionheart/bottlenose: A Python wrapper for the Amazon Product Advertising API.](https://github.com/lionheart/bottlenose)



load_dotenv('./django_project/.env')

AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY')
AWS_ASSOCIATE_TAG = os.environ.get('AWS_ASSOCIATE_TAG')

amazon = bottlenose.Amazon(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_ASSOCIATE_TAG, Region='JP')

response_group = 'Images, ItemAttributes' # image URLs do not seem to be returned unless a response group is given?

response_text = None

cached = False  # set to True to re-read a saved dump instead of calling the API

if cached:
    path = pathlib.Path("./dump7.xml")

    with path.open('rt') as fp:
        response_text = fp.read()

else:

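    # Requests are sometimes rejected (e.g. HTTP 503), so try up to 10 times.
    for i in range(10):
        try:
            response = amazon.ItemSearch(
                Keywords="フルメタル・パニック", SearchIndex="All",
                ResponseGroup=response_group
            )
            response_text = response.decode('utf-8')
            break
        except Exception as e:
            # Usually a 503, but print the actual error instead of assuming it.
            print(f"request failed ({e}), retrying")
            sleep(3)
    else:
        # Every attempt failed, so there is nothing to parse.
        print("giving up after 10 failed attempts", file=sys.stderr)
        sys.exit(1)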


    # path = pathlib.Path("./dump7.xml")
    filename = "./dump_" + datetime.today().strftime("%Y-%m-%d-%H%M%S") + ".xml"
    path = pathlib.Path(filename)

    with path.open(mode="wt") as op:
        op.write(response_text)

soup = BeautifulSoup(response_text, 'lxml')  # with the 'lxml' parser, all tag names are lowercased

itemList = soup.find_all("item")

item_data = list()
for item in itemList :
    print(item.name)

    author_list = [node.text for node in item.find_all('author')]

    creator_list = list()

    for tag in item.find_all('creator'):
        role = tag['role']
        text = tag.text

        creator_list.append((text, role))

    # Which one to use: publisher, manufacturer, studio, or author?
    temp = {
        'title' : item.find('title').text,
        'detail_url' : item.find('detailpageurl').text,
        'asin' : item.find('asin').text,

        'manufacturer' : item.find('manufacturer').text if item.find('manufacturer') else None, # Kindle editions seem to have no Manufacturer?

        'authors' : author_list,
        'creators' : creator_list,

        'small_image_url' : item.find('smallimage').find('url').text if item.find('smallimage') else None,
        'medium_image_url': item.find('mediumimage').find('url').text if item.find('mediumimage') else None
    }

    if item.find('format'):
        temp['format'] = item.find('format').text
    if item.find('binding'):
        temp['binding'] = item.find('binding').text

    print(temp)

    item_data.append(temp)

# An item can have multiple authors and creators.

print(itemList)

print(item_data)
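The retry loop in the script could also be factored into a small helper. Here is a minimal sketch using only the standard library. The function `call_with_retry` is hypothetical (it is not part of bottlenose), and catching `Exception` is a simplification; in real use it would be better to catch only the HTTP error type the client raises.

```python
import time


def call_with_retry(func, attempts=10, delay=3.0, sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries are used up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:  # narrow this to the HTTP error type in real use
            last_error = error
            print(f"attempt {attempt} failed: {error}")
            if attempt < attempts:
                sleep(delay)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


# Demo: a stand-in for the API call that fails twice before succeeding.
demo_calls = []


def flaky_request():
    demo_calls.append(1)
    if len(demo_calls) < 3:
        raise IOError("HTTP Error 503")
    return "<xml/>"


# delay=0 so the demo does not actually wait between attempts.
print(call_with_retry(flaky_request, attempts=5, delay=0))
```

With the script above, the call would be wrapped as something like `call_with_retry(lambda: amazon.ItemSearch(...))`.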