What Happens When You Decompile TikTok’s Web SDK? This.

TikTok VM Reverse Engineering (webmssdk.js)
This project is for reverse engineering the TikTok Virtual Machine (VM).
Overview
TikTok uses a custom virtual machine (VM) as part of its obfuscation and security layers. This project includes tools to:
- Deobfuscate webmssdk.js, the file that contains the virtual machine.
- Decompile TikTok's virtual machine instructions into readable form.
- Script injection: replace webmssdk.js with the deobfuscated version.
- Sign URLs: generate signed URLs that can be used to make authenticated requests, e.g. posting comments.
Deobfuscating
When you open webmssdk.js you are met with a heavily obfuscated file. The main obfuscation method takes advantage of JavaScript's bracket notation, which lets you index an object using another variable.
So when you see something like this:
// Line 3391 of ./deobfVersions/raw.js
r[Gb[301]](Gb[57], e))
you have no idea what is being indexed.
Every use of this method goes through an array Gb, which is defined as:
var Gb = ["ydTGHdFNV", "sNxpGNHMrpLV", "xyrNMLEN Fpp rpMu", "ydWyNe", ...].map(function(a) {
return a.split("").map(function(c) {
return "LsfVNxutyOcrEMpYAGdFHneaUKRXSgoJDbhqICzPZklivTmWBwQj".indexOf(c) == -1 ? c : "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"["LsfVNxutyOcrEMpYAGdFHneaUKRXSgoJDbhqICzPZklivTmWBwQj".indexOf(c)]
}).join("")
});
As you can see, the strings are not readable as written: each one is encoded with a substitution alphabet, "LsfVNxutyOcrEMpYAGdFHneaUKRXSgoJDbhqICzPZklivTmWBwQj".
Since this code is executed immediately, we can copy the snippet, run it in any console and get:
[
"isTrusted",
"beforeunload",
"filename too long",
"isView",
...
]
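The decoding step can also be reproduced standalone. The following is a minimal sketch rather than the project's own script; the names `KEY`, `PLAIN` and `decodeString` are chosen here for illustration, while the alphabet and the sample strings are taken from the snippet above.

```javascript
// Substitution alphabet copied from the Gb definition above.
const KEY = "LsfVNxutyOcrEMpYAGdFHneaUKRXSgoJDbhqICzPZklivTmWBwQj";
// Plain alphabet that each KEY position maps back to.
const PLAIN = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// decodeString is a hypothetical helper name, not one from webmssdk.js.
function decodeString(encoded) {
  return Array.from(encoded, (ch) => {
    const position = KEY.indexOf(ch);
    // Characters outside the alphabet (spaces, digits, ...) pass through.
    return position === -1 ? ch : PLAIN[position];
  }).join("");
}

const sample = ["ydTGHdFNV", "sNxpGNHMrpLV", "xyrNMLEN Fpp rpMu", "ydWyNe"];
console.log(sample.map(decodeString));
// prints [ 'isTrusted', 'beforeunload', 'filename too long', 'isView' ]
```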
Now that we can read each of these strings, we can use a regex to go through the script and replace every use of the array with the string it refers to, as seen here. The same pass also converts bracket notation back to the more readable dot notation.
That leaves us with webmssdk1.
The example from above now looks like this:
r.addEventListener("abort", e),
Better.
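As a rough illustration of that replacement pass, the sketch below inlines `Gb[n]` lookups and then rewrites `obj["name"]` as `obj.name`. It is an assumption-laden simplification, not the repository's script: `inlineStrings` is a hypothetical name, the two indices are inferred from the before and after lines shown above, and a plain regex like this will also rewrite matches inside string literals or comments.

```javascript
// inlineStrings is a hypothetical helper; arrayName defaults to the Gb array.
function inlineStrings(source, strings, arrayName = "Gb") {
  const lookup = new RegExp(`\\b${arrayName}\\[(\\d+)\\]`, "g");
  // Step 1: replace Gb[123] with the quoted string it refers to.
  const inlined = source.replace(lookup, (match, index) => {
    const value = strings[Number(index)];
    return value === undefined ? match : JSON.stringify(value);
  });
  // Step 2: turn x["name"] into x.name when "name" is identifier-like.
  // The lookbehind keeps array literals such as ["a"] untouched.
  return inlined.replace(
    /(?<=[\w$)\]])\[\s*"([A-Za-z_$][A-Za-z0-9_$]*)"\s*\]/g,
    ".$1"
  );
}

// Only the two entries needed for the example are filled in.
const strings = [];
strings[301] = "addEventListener";
strings[57] = "abort";

console.log(inlineStrings("r[Gb[301]](Gb[57], e)", strings));
// prints r.addEventListener("abort", e)
```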
Another significant obfuscation method is disguising function calls.
Each function is defined as an element of an array Ab.
var Ab = [function(e) {
return "[object Array]" === Object.prototype.toString.call(e)
}
, function(e) {
return e && e.__esModule && Object.prototype.hasOwnProperty.call(e, "default") ? e.default : e
}
, function() {
var Ga;
Ga = [0, 1],
(je = !Ga[0],
le && (setTimeout(function() {
document.dispatchEvent(new Event(pe))
}, Ga[1]),
document.removeEventListener("DOMContentLoaded", Ab[40]),
document.removeEventListener("readystatechange", Ab[75])))
}
...]
and is used by calling Ab[index](args),
such as:
Ab[31](f[e], t, n, i)
In a typical IDE, jumping to the definition of such a call only takes us to the start of the array, which makes it hard to keep track of which call invokes which function.
We can make this more readable by:
- Extracting the array
- Replacing each element with its own standard function, named Abindex(args)
- Replacing each call of Ab[index](args) with Abindex(args)
We can do this by working on the AST of the script with Babel, as seen here.
That gives us this.
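To make the call-site half of that transformation concrete, here is a deliberately simplified sketch. It uses a regex instead of a Babel AST pass, so unlike the approach described above it cannot tell code from string literals or comments, and it does not hoist the array elements into named functions. `renameArrayRefs` is a hypothetical name.

```javascript
// Rewrites Ab[31](...) to Ab31(...) and bare references such as
// Ab[40] to Ab40. A regex stand-in for the AST-based pass, for illustration.
function renameArrayRefs(source, arrayName = "Ab") {
  const ref = new RegExp(`\\b${arrayName}\\[(\\d+)\\]`, "g");
  return source.replace(ref, (_match, index) => `${arrayName}${index}`);
}

console.log(renameArrayRefs("Ab[31](f[e], t, n, i)"));
// prints Ab31(f[e], t, n, i)
```

Working on the AST is the safer choice for a file of this size, since it only touches real member expressions.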
In the virtual machine part of the script, the bytecode interpreter is implemented as a chain of nested if/else statements, as seen here.
This is really just an ordinary switch statement that has been heavily disguised. After converting some of the cases by hand, I had AI help with the rest, which gave me this. The result looks like a standard bytecode VM.
After decompiling the virtual machine later on and seeing which functions it used, I was able to tell what was being done and rename some of the variables.
After all this, plus undoing some smaller obfuscation techniques, here is the latest version of the file.
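To illustrate why a nested if/else dispatcher is equivalent to a switch, here is a toy example. The four opcodes are hypothetical and are not TikTok's; the point is only that range comparisons on the opcode select exactly one branch, which is the same thing a switch does.

```javascript
// Disguised form: the opcode is narrowed down with nested comparisons.
function dispatchNested(op, stack) {
  if (op < 2) {
    if (op < 1) {
      stack.push(stack.pop() + stack.pop()); // op 0: add
    } else {
      stack.push(-stack.pop()); // op 1: negate
    }
  } else {
    if (op < 3) {
      stack.push(stack[stack.length - 1]); // op 2: duplicate top
    } else {
      stack.pop(); // op 3: drop top
    }
  }
  return stack;
}

// Readable form: the same four branches as a plain switch.
function dispatchSwitch(op, stack) {
  switch (op) {
    case 0: stack.push(stack.pop() + stack.pop()); break;
    case 1: stack.push(-stack.pop()); break;
    case 2: stack.push(stack[stack.length - 1]); break;
    default: stack.pop(); break;
  }
  return stack;
}

for (let op = 0; op < 4; ++op) {
  console.log(op, dispatchNested(op, [3, 4]), dispatchSwitch(op, [3, 4]));
}
```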
Decrypting bytecode
With the file fully deobfuscated, it is much easier to reason about, and I could quickly find where the VM is initialised, here.
The bytecode is stored as one long string that is XOR'ed with a key embedded inside the string itself.
// Line 3046 of latestDeobf.js
// Getting XOR key
for (var t = atob(payload), r = 0, n = 4; n < 8; ++n) r += t.charCodeAt(n);
// Decrypting bytecode
unZip(Uint8Array.from(t.slice(8), XOR, r % 256), { i: 2 }, t && t.out, t && t.dictionary),
// Extracting strings, functions and metadata for each function
for (var n = leb128
i = leb128
for (o = 0; o < i; ++o) {
for (var argsLength = leb128
for (var instructions = new Array(), h = leb128
instructionSets.push([instructions, argsLength, isStrictMode, exceptionHandlers]);
}
Note: The string is also gzip-compressed, and each value is LEB128-encoded, likewise to save space.
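The key derivation and the LEB128 reads can be sketched as follows. This is a minimal sketch under stated assumptions: `deriveKey`, `xorBody` and `readUleb128` are hypothetical names, the XOR callback is assumed to XOR each character code with the key, the LEB128 values are assumed to be unsigned, and the decompression step between the two is left out.

```javascript
// decoded is the binary string produced by atob(payload).
// The key is the sum of the char codes at positions 4..7, modulo 256.
function deriveKey(decoded) {
  let sum = 0;
  for (let n = 4; n < 8; ++n) sum += decoded.charCodeAt(n);
  return sum % 256;
}

// XOR everything after the 8-byte header with the derived key.
// Assumption: the real XOR callback does the same per-character XOR.
function xorBody(decoded) {
  const key = deriveKey(decoded);
  return Uint8Array.from(decoded.slice(8), (ch) => ch.charCodeAt(0) ^ key);
}

// Read one unsigned LEB128 value starting at offset.
// Returns the value and the offset of the next unread byte.
function readUleb128(bytes, offset = 0) {
  let value = 0;
  let shift = 0;
  let pos = offset;
  while (true) {
    if (pos >= bytes.length) throw new RangeError("truncated LEB128 value");
    const byte = bytes[pos++];
    value += (byte & 0x7f) * 2 ** shift; // low 7 bits carry data
    shift += 7;
    if ((byte & 0x80) === 0) break; // high bit clear marks the last byte
  }
  return { value, next: pos };
}

// Synthetic payload: header bytes 4..7 sum to 10, body is "Hi" XOR 10.
const demo = String.fromCharCode(0, 0, 0, 0, 1, 2, 3, 4, 72 ^ 10, 105 ^ 10);
console.log(deriveKey(demo), Array.from(xorBody(demo)));
// prints 10 [ 72, 105 ]
console.log(readUleb128([0xe5, 0x8e, 0x26]));
// prints { value: 624485, next: 3 }
```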
Virtual Machine Decompiling
TikTok uses a full bytecode VM: if you browse through it, you will see that it supports scopes, nested functions and exception handling. This is not a typical obfuscation VM, and it shows that the design is certainly sophisticated.
To write the decompiler I simply went through each of the cases and wrote code that emits the corresponding source for it. For any case that jumps to a different position (as used for loops), like this:
case 2:
var a = instructions[index++];
stack[pointer] ? --pointer : index += a;
break;
I just stop it from jumping and emit a comment instead:
case 2:
var a = instructions[index++];
//stack[pointer] ? --pointer : index += a;
addCode(`// if (!v${pointer}) skip ${a} to ${index + a}`, byteCodePos)
break;
After doing this for all cases, I dumped each function into its own file here. The output is not fully readable, but it gives you an overview of what each function does, for example VM223, which generates random characters.
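The emit-instead-of-execute idea can be shown on a toy instruction set. In the sketch below only case 2 mirrors the snippet above; opcodes 0 and 1, the `decompile` function and the line format are hypothetical and chosen purely for illustration.

```javascript
// Walks the instruction list once and emits pseudo-source for each
// opcode instead of executing it. Jumps are never followed.
function decompile(instructions) {
  const lines = [];
  let index = 0;
  let pointer = -1; // simulated stack pointer, used to name v0, v1, ...
  const addCode = (text, pos) => lines.push(`/* ${pos} */ ${text}`);

  while (index < instructions.length) {
    const byteCodePos = index;
    const op = instructions[index++];
    switch (op) {
      case 0: { // hypothetical: push a constant
        const value = instructions[index++];
        addCode(`var v${++pointer} = ${value};`, byteCodePos);
        break;
      }
      case 1: { // hypothetical: add the top two stack slots
        addCode(`v${pointer - 1} = v${pointer - 1} + v${pointer};`, byteCodePos);
        pointer--;
        break;
      }
      case 2: { // conditional skip, as in the snippet above
        const a = instructions[index++];
        addCode(`// if (!v${pointer}) skip ${a} to ${index + a}`, byteCodePos);
        break;
      }
      default:
        addCode(`// unknown opcode ${op}`, byteCodePos);
    }
  }
  return lines;
}

console.log(decompile([0, 5, 0, 7, 1, 2, 3]).join("\n"));
// prints
// /* 0 */ var v0 = 5;
// /* 2 */ var v1 = 7;
// /* 4 */ v0 = v0 + v1;
// /* 5 */ // if (!v0) skip 3 to 10
```

Because jumps are only annotated, every instruction is visited exactly once in linear order, which is why the output reads as an overview rather than as structured code.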
Debugging
Since this is a JavaScript file executed in the browser, it is possible to replace the normal webmssdk.js with the deobfuscated file and use TikTok normally.
This can be achieved with two browser extensions: Tampermonkey, to run custom code, and a CSP-disabling extension, to turn off the Content Security Policy so that files can be fetched from otherwise blocked origins. That way I can host latestDeobf.js on my own file server and have it fetched on every page load, so I can edit the file and see the changes each time I refresh. This makes testing much easier while reversing functions.
The script can be found here
Requests
Now that we have deobfuscated the file and decompiled the VM, we can start reversing any function we want and find out what it does.
When a request is made to the server, it usually carries 3 additional headers.
| Header | Description |
|---|---|
| msToken | Sent by the server and reissued with each request. |
| X-Bogus | Generated by webmssdk.js based on the request. |
| _signature | Generated by webmssdk.js based on the request. |
When making a request that does not require authentication, such as querying a user, only X-Bogus needs to be generated, which can be done with window.frontierSign. _signature is not needed, and any msToken can be used.
This popular library makes those requests for you. It uses a browser automation library called Playwright, which simply spins up a browser instance, so it is easy to call window.frontierSign.
When it comes to making authenticated requests, such as posting a comment, _signature is required, and it is not exposed on window.
Signer
The entry point called for each request is VM86, which then calls:
- VM113 for X-Bogus
- VM189 for _signature
I was able to write a signer that successfully signed the URL.
Here's a demo of it posting a comment, then checking the result in a private browser window to confirm that it succeeded.
Note: There are also some forms of bot protection within VM86, such as mouse tracking (VM120) and environment inspection (VM265), but these are purely client-side checks that do not communicate with the server, so they can be ignored when generating signatures.
Note: TikTok's VM changes constantly with new releases. There is a high chance that the main algorithms will change and that the new VM will have to be reversed again.