A PostgreSQL's Foreign Data Wrapper (FDW) for retrieving the movie ranking data via the public API of douban.com. This FDW is written mainly in GO(cgo).
This toy FDW was inspired by the following github repository
-
make sure the path to the PostgreSQL's binary had been exported to the
PATH
environment variable because it is necessary to use thepg_config
command for the source build.please use
which pg_config
to check the path -
make sure the GO language was successfully installed and the path to
go
had been exported to thePATH
environment variableplease use
go version
to check that -
make sure the toolchain of
gcc
andmake
were successfully installed -
execute the following command
$cd /path/to/douban_fdw $make $make install
if the commands above successed, the shared library files would be installed into
/path/to/postgres/install/lib
and the other files would be installed into/path/to/postgres/install/share
-
restart the postgres instance
-
use the postgres client (such as
psql
) to connect to the postgres instance and execute the following SQL statement and it should be done by the superuserCREATE EXTENSION douban_fdw;
-
first of all, create a foreign server like
CREATE SERVER {servername} FOREIGN DATA WRAPPER douban_fdw;
-
define a foreign table with the foreign server above
CREATE FOREIGN TABLE {tablename} (rating {data type}...) SERVER {servername} OPTIONS (rank_name 'top250');
you can name the foreign table whatever you wanted to, but the column name should be as follows. if you defined an column name out of the valid range, it would cause an error when you queried the table
the column names which can be identified:
- casts
- collectcount
- directors
- genres
- id
- originname
- rating
- subtype
- title
- url
- year
-
use the SELECT statement to query the foreign table defined above
-
this FDW currently can work with PostgreSQL 9.5 only, because the internal fdw interfaces changed
-
the foreign table defined by douban_fdw can work properly only on the database of which the encoding being UTF8, because most of the data retrieved from douban.com are simplified chinese characters (encoded in UTF8)
-
according to the official manual, the douban's public api can be called only 40 times per-api-per-hour for one ip address
the following features are on the way
- the implementation of the rescan routine
- a local persistant buffer to solve the times limit issue of the douban API
- server-side encoding convert to support the database of which not being UTF8-encoded
- PostgreSQL 9.6+ support