use session to scrape user/hashtag/music/trending feed
drawrowfly committed Dec 19, 2020
1 parent 14dbed9 commit 763a0ca
Showing 13 changed files with 237 additions and 151 deletions.
191 changes: 97 additions & 94 deletions README.md

Large diffs are not rendered by default.

23 changes: 19 additions & 4 deletions bin/cli.js
@@ -20,6 +20,10 @@ const startScraper = async argv => {
argv.fileName = argv.filename;
}

if (argv.session) {
argv.sessionList = [argv.session];
}

if (argv.historypath) {
argv.historyPath = argv.historypath;
}
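
The `--session` handling in this hunk can be read on its own as a small normalization step: a single CLI value becomes the one-element `sessionList` array that the module examples pass directly. A minimal sketch, assuming a hypothetical `normalizeSession` helper that is not part of the commit:

```javascript
// Hypothetical helper (not in the commit): mirrors the hunk above by
// copying a single --session value into a one-element sessionList array.
function normalizeSession(argv) {
    const result = { ...argv };
    if (result.session) {
        result.sessionList = [result.session];
    }
    return result;
}

// With a session value, sessionList is populated.
console.log(normalizeSession({ session: 'sid_tt=dae32131231' }).sessionList);
// Without one, sessionList is left unset.
console.log(normalizeSession({}).sessionList);
```

Returning a copy instead of mutating `argv` is a choice made only for this sketch; the commit assigns to `argv` in place.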
@@ -71,10 +75,10 @@ const startScraper = async argv => {

yargs
.usage('Usage: $0 <command> [options]')
.example(`$0 user USERNAME -d -n 100`)
.example(`$0 trend -d -n 100`)
.example(`$0 hashtag HASHTAG_NAME -d -n 100`)
.example(`$0 music MUSIC_ID -d -n 50`)
.example(`$0 user USERNAME -d -n 100 --session sid_tt=dae32131231`)
.example(`$0 trend -d -n 100 --session sid_tt=dae32131231`)
.example(`$0 hashtag HASHTAG_NAME -d -n 100 --session sid_tt=dae32131231`)
.example(`$0 music MUSIC_ID -d -n 50 --session sid_tt=dae32131231`)
.example(`$0 video https://www.tiktok.com/@tiktok/video/6807491984882765062 -d`)
.example(`$0 history`)
.example(`$0 history -r user:bob`)
@@ -109,6 +113,9 @@ yargs
alias: 'h',
describe: 'help',
},
session: {
describe: 'Set session cookie value. Session is required to scrape user/trending/hashtag/music feed',
},
timeout: {
default: 0,
describe: 'Set timeout between requests. Timeout is in Milliseconds: 1000 mls = 1 s',
@@ -259,6 +266,14 @@ yargs
argv._[0] = 'getUserProfileInfo';
}

if (CONST.requiredSession.indexOf(argv._[0]) > -1) {
if (!argv.session) {
throw new Error(
'In order to scrape user/trending/music/hashtag feed you need to set authenticated session cookie value --session. Please read the github readme',
);
}
}

return true;
})
.demandCommand()
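
The session guard added to the yargs check can be sketched as a standalone function. The command list is copied from the `requiredSession` constant this commit adds to `src/constant/index.ts`; the `assertSession` name and the error wording are assumptions made for illustration, not the commit's code:

```javascript
// Command names copied from the requiredSession constant added in this commit.
const requiredSession = ['user', 'hashtag', 'trend', 'music'];

// Hypothetical standalone version of the guard; the commit performs the
// equivalent check inline inside the yargs callback.
function assertSession(command, session) {
    if (requiredSession.includes(command) && !session) {
        throw new Error(`The "${command}" command needs an authenticated session cookie; pass it with --session`);
    }
    return true;
}

// Commands outside the list pass without a session.
console.log(assertSession('video'));
// Listed commands pass once a session value is supplied.
console.log(assertSession('user', 'sid_tt=dae32131231'));
```

Calling `assertSession('user')` with no second argument throws, which matches the behaviour the commit introduces for the feed commands.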
20 changes: 11 additions & 9 deletions examples/CLI/Examples.md
@@ -2,11 +2,13 @@

## Terminal Examples

#### NOTE: in order to download the user/music/trending/hashtag feed you need to set the session cookie value. Read more about it in the main Readme file: How to get/set session value

**Example 1:**
Scrape 300 video posts from the user {USERNAME}. Save post metadata to the CSV(-t csv) file

```sh
tiktok-scraper user USERNAME -n 300 -t csv
tiktok-scraper user USERNAME -n 300 -t csv --session sid_tt=asdasd13123123123adasda

Output:
CSV path: /bla/blah/USERNAME_1552945544582.csv
@@ -16,7 +18,7 @@ CSV path: /bla/blah/USERNAME_1552945544582.csv
Scrape 100 posts from the hashtag {HASHTAG_NAME}, download(-d) and save them to the ZIP(-z) archive, save post metadata to the JSON and CSV files (-t all)

```sh
tiktok-scraper hashtag HASHTAG_NAME -n 100 -d -z -t all
tiktok-scraper hashtag HASHTAG_NAME -n 100 -d -z -t all --session sid_tt=asdasd13123123123adasda

Output:
ZIP path: /bla/blah/HASHTAG_NAME_1552945659138.zip
@@ -28,7 +30,7 @@ CSV path: /bla/blah/HASHTAG_NAME_1552945659138.csv
Scrape 50 posts from the trends section, download(-d) them to a ZIP(-z) and save metadata to the CSV(-t csv) file

```sh
tiktok-scraper trend -n 50 -d -z -t csv
tiktok-scraper trend -n 50 -d -z -t csv --session sid_tt=asdasd13123123123adasda


Output:
@@ -40,7 +42,7 @@ CSV path: /bla/blah/tend_1552945659138.csv
Scrape 100 posts from a particular music ID (numeric ID from TikTok URL), download(-d) and save posts to the ZIP(-z) and metadata to the CSV(-t csv) files

```sh
tiktok-scraper music MUSICID -n 100 -d -z -t csv
tiktok-scraper music MUSICID -n 100 -d -z -t csv --session sid_tt=asdasd13123123123adasda

Output:
ZIP path: /bla/blah/music_1552945659138.zip
@@ -54,7 +56,7 @@ Download(-d) 20 latest video post from the user {USERNAME} and save the progress
- When the same command is executed again, the scraper will only download newly posted videos

```sh
tiktok-scraper user USERNAME -n 20 -d -s
tiktok-scraper user USERNAME -n 20 -d -s --session sid_tt=asdasd13123123123adasda


Output:
@@ -65,7 +67,7 @@ Folder Path: /User/Bob/Downloads/USERNAME
Download(-d) the 20 latest video posts without the watermark(-w) from the trending feed

```sh
tiktok-scraper trend -n 20 -d -w
tiktok-scraper trend -n 20 -d -w --session sid_tt=asdasd13123123123adasda


Output:
@@ -113,7 +115,7 @@ tiktok-scraper history
Download(-d) the 100 latest video posts from the user {USERNAME}, save posts to the folder(without zip) and do not download metadata

```sh
tiktok-scraper user USERNAME -n 100 -d
tiktok-scraper user USERNAME -n 100 -d --session sid_tt=asdasd13123123123adasda


Output:
@@ -124,7 +126,7 @@ Folder Path: /Blah/Blah/USERNAME
Download(-d) the 50 latest video posts from the user {USERNAME}, save posts to the custom folder path(without zip) and save post metadata to the CSV file

```sh
tiktok-scraper user USERNAME -n 50 -d --filepath /User/Bob/Downloads -t csv
tiktok-scraper user USERNAME -n 50 -d --filepath /User/Bob/Downloads -t csv --session sid_tt=asdasd13123123123adasda


Output:
@@ -136,7 +138,7 @@ CSV path: /User/Bob/Downloads/USERNAME_1587898252464.csv
Download(-d) the 20 latest video posts without the watermark(-w) in HD(--hd) quality from the user {USERNAME}

```sh
tiktok-scraper user USERNAME -n 20 -d -w --hd
tiktok-scraper user USERNAME -n 20 -d -w --hd --session sid_tt=asdasd13123123123adasda


Output:
2 changes: 1 addition & 1 deletion examples/getHashtagFeed.ts
@@ -2,7 +2,7 @@ import { hashtag } from '../src';

(async () => {
try {
const posts = await hashtag('summer', { number: 1, noWaterMark: true });
const posts = await hashtag('summer', { number: 1, sessionList: ['sid_tt=asdasd13123123123adasda;'] });
console.log(posts.collector);
} catch (error) {
console.log(error);
2 changes: 1 addition & 1 deletion examples/getMusicFeed.ts
@@ -2,7 +2,7 @@ import { music } from '../src';

(async () => {
try {
const posts = await music('6548327243720952577', { number: 1, noWaterMark: true });
const posts = await music('6548327243720952577', { number: 1, sessionList: ['sid_tt=asdasd13123123123adasda;'] });
console.log(posts.collector);
} catch (error) {
console.log(error);
2 changes: 1 addition & 1 deletion examples/getTrendingFeed.ts
@@ -2,7 +2,7 @@ import { trend } from '../src';

(async () => {
try {
const posts = await trend('', { number: 1, noWaterMark: true });
const posts = await trend('', { number: 1, sessionList: ['sid_tt=asdasd13123123123adasda;'] });
console.log(posts.collector);
} catch (error) {
console.log(error);
2 changes: 1 addition & 1 deletion examples/getUserFeed.ts
@@ -2,7 +2,7 @@ import { user } from '../src';

(async () => {
try {
const posts = await user('tiktok', { number: 1, noWaterMark: true });
const posts = await user('tiktok', { number: 1, sessionList: ['sid_tt=asdasd13123123123adasda;'] });
console.log(posts.collector);
} catch (error) {
console.log(error);
1 change: 1 addition & 0 deletions src/constant/index.ts
@@ -13,6 +13,7 @@ export = {
'userprofile',
],
history: ['user', 'hashtag', 'trend', 'music'],
requiredSession: ['user', 'hashtag', 'trend', 'music'],
sourceType: {
user: 8,
music: 11,
29 changes: 14 additions & 15 deletions src/core/TikTok.test.ts
@@ -19,7 +19,7 @@ describe('TikTok Scraper MODULE(promise): user(valid input data)', () => {
noWaterMark: false,
type: 'user',
headers: {
'User-Agent': 'Custom User-Agent',
'user-agent': 'Custom user-agent',
},
proxy: '',
number: 5,
@@ -33,7 +33,7 @@ describe('TikTok Scraper MODULE(promise): user(valid input data)', () => {

it('set custom user-agent', async () => {
expect(instance).toBeInstanceOf(TikTokScraper);
expect(instance.headers['User-Agent']).toContain('Custom User-Agent');
expect(instance.headers['user-agent']).toContain('Custom user-agent');
});

it('getUserId should return a valid Object', async () => {
@@ -46,7 +46,6 @@ describe('TikTok Scraper MODULE(promise): user(valid input data)', () => {
minCursor: 0,
secUid: '',
sourceType: 8,
user_agent: 'Custom User-Agent',
verifyFp: '',
});
});
@@ -69,7 +68,7 @@ describe('TikTok Scraper MODULE(event): user(valid input data)', () => {
input: 'tiktok',
type: 'user',
headers: {
'User-Agent': 'Custom User-Agent',
'user-agent': 'Custom user-agent',
},
proxy: '',
number: 1,
@@ -148,7 +147,7 @@ describe('TikTok Scraper MODULE(promise): user(invalid input data)', () => {
input: '',
type: 'user',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -166,7 +165,7 @@ describe('TikTok Scraper MODULE(promise): user(invalid input data)', () => {
input: '',
type: 'fake' as ScrapeType,
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -186,7 +185,7 @@ describe('TikTok Scraper MODULE(event): user(invalid input data)', () => {
input: '',
type: 'user',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 1,
@@ -209,7 +208,7 @@ describe('TikTok Scraper MODULE(event): user(invalid input data)', () => {
input: '',
type: 'fake' as ScrapeType,
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -238,7 +237,7 @@ describe('TikTok Scraper MODULE(promise): user(save to a file)', () => {
input: 'tiktok',
type: 'user',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -273,7 +272,7 @@ describe('TikTok Scraper MODULE(promise): hashtag(valid input data)', () => {
input: 'summer',
type: 'hashtag',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -308,7 +307,7 @@ describe('TikTok Scraper MODULE(promise): signUrl', () => {
input: 'https://m.tiktok.com/share/item/list?secUid=&id=355503&type=3&count=30&minCursor=0&maxCursor=0&shareUid=&lang=',
type: 'signature',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -338,7 +337,7 @@ describe('TikTok Scraper MODULE(promise): getHashtagInfo', () => {
input: hasthagName,
type: 'single_hashtag',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -389,7 +388,7 @@ describe('TikTok Scraper MODULE(promise): getUserProfileInfo', () => {
input: userName,
type: 'single_user',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -459,7 +458,7 @@ describe('TikTok Scraper CLI: user(save progress)', () => {
input: 'tiktok',
type: 'user',
headers: {
'User-Agent': 'okhttp',
'user-agent': 'okhttp',
},
proxy: '',
number: 5,
@@ -497,7 +496,7 @@ describe('TikTok Scraper MODULE(promise): getVideoMeta', () => {
input: 'https://www.tiktok.com/@tiktok/video/6807491984882765062',
type: 'video_meta',
headers: {
'User-Agent': CONST.userAgent(),
'user-agent': CONST.userAgent(),
},
proxy: '',
number: 5,
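
The `'User-Agent'` to `'user-agent'` rename throughout this test file matters because keys of a plain JavaScript object are case-sensitive, even though HTTP treats header names as case-insensitive. A short sketch of the difference, assuming a hypothetical `lowerCaseHeaders` helper that is not part of the library:

```javascript
// Hypothetical helper (not in the library): returns a copy of a headers
// object with every header name lower-cased.
function lowerCaseHeaders(headers) {
    return Object.fromEntries(
        Object.entries(headers).map(([name, value]) => [name.toLowerCase(), value]),
    );
}

const headers = { 'User-Agent': 'Custom user-agent' };

// Looking up the lower-case key on the original object finds nothing.
console.log(headers['user-agent']); // undefined
// After normalization the same lookup succeeds.
console.log(lowerCaseHeaders(headers)['user-agent']); // Custom user-agent
```

Whether the scraper normalizes header names internally is not shown in this diff; the tests here simply use the lower-case key on both the input and the assertion.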