From 023c98aca42694607fefb7d8d7692b45830d9669 Mon Sep 17 00:00:00 2001
From: X3ZvaWQ
Date: Wed, 14 Feb 2024 13:47:40 +0800
Subject: [PATCH] chore: readme

---
 README.md         | 31 ++++++++++++++++++++++++++++++-
 README_zhcn.md    | 32 ++++++++++++++++++++++++++++++++
 rollup.config.mjs |  1 -
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 9acb3f3..9d65e43 100644
--- a/README.md
+++ b/README.md
@@ -42,7 +42,7 @@ try {
 
 ### About data interaction
 
-Regarding the issue of interacting between JavaScript objects and Lua tables, the previous solution before version 1.18.0 was to perform a one-time conversion, such as:
+Regarding the issue of interacting between JavaScript objects and Lua tables, the previous solution before version 1.18.0 was to perform a one-time conversion, such as:
 
 ```js
 const obj = { name: 233 };
@@ -84,6 +84,35 @@ console.log(o.name); // 23333
 lua.doStringSync('print(obj.name)'); // 23333
 ```
 
+### About character encoding
+
+Lua strings have no special internal representation the way JavaScript strings do. Like C strings, they are plain byte arrays. As a result, when a string is exchanged between Lua and JavaScript, Lua hands the host environment a `char` pointer to parse, and the wasm side decodes those bytes as UTF-8 by default. If the file read by Lua is encoded in GBK, the resulting string is irreversibly garbled.
+
+How can this be solved? The common approach is to encode the string inside Lua into ASCII-range characters first, for example with base64 or by extracting the raw byte array, so that nothing is garbled in transit. The JavaScript side then decodes the data wherever it needs to read it, for example with `iconv.decode(Buffer.from(data, 'base64'), 'gbk')`.
+
+However, this approach can also run into problems. Currently in this project, if you inject an object that is not a plain object into the Lua environment, the userdata inside Lua is actually a proxy, and every operation on it is ultimately handed back to JavaScript. For example:
+
+```js
+const lua = await Lua.create()
+class TestClass {}
+const test = new TestClass()
+lua.ctx.test = test;
+await lua.doString(`
+test.func = function(str) print(str) end
+`)
+```
+
+At this point, although `test.func` is defined inside Lua, `test` is a proxied userdata, so `func` passes through an extra layer of JavaScript wrapping. It effectively becomes a wrapper around a Lua function, which is to say a JavaScript function.
+
+So what problem does this cause? The string passed to `func` cannot be escaped first: no matter how `func` is written, its arguments are handed to JavaScript before it runs, and any string that passes through JavaScript in an encoding other than UTF-8 is garbled. To solve this, make sure that `test` inside Lua is a plain table, or at least not a proxy from JavaScript, so that string arguments are not passed to the JS side unprocessed.
+However, if other functions on `test` use `this`, doing so leaves it unclear what `this` refers to and makes it impossible to modify the table. Those functions live in the JS environment, but when they run inside Lua we certainly want them to be able to modify the table's values.
+
+> I wonder whether `pushTable` could, when a key or value is a function, bind it to a proxied `this` and translate any operations the function performs on `this` into operations on the target table. The general idea seems feasible.
+> To be written up later.
+
+Another solution is to override the `__index` and `__newindex` metamethods of injected objects using `JsType.decorate`. When a method is added to an object inside Lua, add it to the Lua-side object as well as the JavaScript-side object. On an index operation, first check whether the Lua-side object has a native method or value; if it does, return it directly instead of fetching JavaScript's function wrapper. GBK-encoded strings can then be escaped inside the native method.
+
+For implementation details, see [https://github.com/JX3BOX/jx3-skill-parser/blob/master/src/wasmoon-helper.ts](https://github.com/JX3BOX/jx3-skill-parser/blob/master/src/wasmoon-helper.ts)
+
 ## CLI Usage
 
 Although Wasmoon has been designed to be embedded, you can run it on command line as well, but, if you want something more robust on this, we recommend to take a look at [demoon](https://github.com/ceifa/demoon).
diff --git a/README_zhcn.md b/README_zhcn.md
index 1d336af..458c383 100644
--- a/README_zhcn.md
+++ b/README_zhcn.md
@@ -36,3 +36,35 @@ console.log(o.name); // 23333
 lua.doStringSync('print(obj.name)'); // 23333
 ```
 
+
+### About character encoding
+
+Lua strings have no special storage format the way JS strings do. In a sense they are the same as C strings: plain byte arrays. This leads to the following situation. When string data is exchanged between Lua and JS, Lua hands the host environment a `char` pointer to parse, but wasm decodes it as UTF-8 by default, and that is where the problem arises. If the file read by Lua is encoded in GBK, the result is irreversibly garbled.
+
+How can this be solved? The common approach is to encode the string inside Lua into ASCII-range characters first, either with base64 or by extracting the raw byte array, to prevent garbling. The JS side then decodes the data wherever it needs to read it. The decoding can be done with `iconv.decode(Buffer.from(data, 'base64'), 'gbk')`.
+
+However, this approach can also run into problems. Currently in this project, if you inject a non-plainObject value into the Lua environment, the userdata inside Lua is actually a proxy, and every operation on it is ultimately handed back to the JS side. For example:
+
+```js
+const lua = await Lua.create()
+
+class TestClass {}
+const test = new TestClass()
+
+lua.ctx.test = test;
+
+await lua.doString(`
+    test.func = function(str) print(str) end
+`)
+```
+At this point, although `test.func` is defined inside Lua, `test` is a proxied userdata, so `func` passes through a layer of JS wrapping. It effectively becomes a wrapper around a Lua function, which is to say a JS function.
+
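+So what problem does this cause? The string passed to `func` cannot be escaped first: no matter how `func` is written, its arguments are handed to JS before it runs, and any string that passes through JS in an encoding other than UTF-8 is garbled. To solve this, make sure that `test` inside Lua is a plain table, or at least not a proxy from JS, so that string arguments are not passed to the JS side unprocessed.
+
+However, if other functions on `test` use `this`, doing so leaves it unclear what `this` refers to and makes it impossible to modify the table. Those functions live in the JS environment, but when they run inside Lua we certainly want them to be able to modify the table's values.
+
+> I wonder whether `pushTable` could, when a key or value is a function, bind it to a proxied `this` and translate any operations the function performs on `this` into operations on the target table. The general idea seems feasible. To be written up later.
+
+Another solution is to override the `__index` and `__newindex` metamethods of injected objects using `JsType.decorate`. When a method is added to an object inside Lua, add it to the Lua-side object as well as the JS-side object. On an index operation, first check whether the Lua-side object has a native method or value; if it does, return it directly instead of fetching the JS function wrapper. GBK-encoded strings can then be escaped inside the native method.
+
+For implementation details, see [https://github.com/JX3BOX/jx3-skill-parser/blob/master/src/wasmoon-helper.ts](https://github.com/JX3BOX/jx3-skill-parser/blob/master/src/wasmoon-helper.ts)
diff --git a/rollup.config.mjs b/rollup.config.mjs
index 9b46d20..a201c87 100644
--- a/rollup.config.mjs
+++ b/rollup.config.mjs
@@ -2,7 +2,6 @@ import typescript from '@rollup/plugin-typescript';
 import json from '@rollup/plugin-json';
 import dts from 'rollup-plugin-dts';
 import copy from 'rollup-plugin-copy';
-import pkg from './package.json' assert { type: 'json' };
 
 const production = !process.env.ROLLUP_WATCH;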
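
The base64 round trip described in the character-encoding section of this patch can be sketched roughly as follows. This is a minimal illustration, not wasmoon code: the hard-coded bytes stand in for a string Lua read from a GBK file, and Node's built-in `TextDecoder` is swapped in for the `iconv.decode` call the README mentions so that the snippet runs without extra packages. It assumes a Node.js build with full ICU, which `TextDecoder` needs for `'gbk'`.

```javascript
// Sketch only: Node.js built-ins, no wasmoon involved.
// Assumption: Node.js with full ICU, so TextDecoder accepts the 'gbk' label.

// The GBK encoding of "中文", standing in for bytes Lua read from a GBK file.
const gbkBytes = Buffer.from([0xd6, 0xd0, 0xce, 0xc4]);

// Decoding the raw bytes as UTF-8 replaces each invalid sequence with U+FFFD,
// so the original bytes can no longer be recovered from the resulting string.
const garbled = new TextDecoder('utf-8').decode(gbkBytes);

// Base64 output contains only ASCII characters, so it passes through a
// UTF-8 string boundary unchanged.
const transported = gbkBytes.toString('base64');

// On the JavaScript side: restore the bytes, then decode with the real encoding.
const restored = new TextDecoder('gbk').decode(Buffer.from(transported, 'base64'));

console.log(garbled.includes('\uFFFD'));
console.log(restored);
```

In a real setup the base64 step would run inside Lua before the string crosses into JavaScript; only the decoding half belongs on the JS side.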