Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Amplitude tracking from this package #54

Open
AndrewRayCode opened this issue Apr 20, 2021 · 7 comments
Open

Remove Amplitude tracking from this package #54

AndrewRayCode opened this issue Apr 20, 2021 · 7 comments

Comments

@AndrewRayCode
Copy link

AndrewRayCode commented Apr 20, 2021

Describe the solution you'd like
Could you please remove amplitude tracking from this package and remove the --noinsight flag? I'm happy to submit a PR to remove it myself if you're ok with it. This behavior is not obvious and a potential security risk for people using this package for private build tools. Npm gives you download statistics which are useful for understanding package usage.

@nvuillam
Copy link
Owner

nvuillam commented Apr 20, 2021

It's just anonymous stats, administrated by myself only, so where is the security risk ? And if you really think there is one, you can indeed use --noinsight
You can see the content of what is sent , there is no IP, no code... just usage stats, and it allowed me to know which commands are more used so I can focus on them (npm stats it for the package, it's finer with amplitude)

By the way, sfdx-cli collects all stats and I'm sure they are more detailed than sfdx-essentials ones ;)

@AndrewRayCode
Copy link
Author

I'm not sure how to convince you that sending operating system information and environment variables(!!) and to an unexpected HTTP endpoint during a script run is a major security issue. sfdx is the standard tool for modern CICID deployment, so now this tool is collecting infrastructure information about deployment systems that are potentially communicating with production systems. The reporting is also a dark pattern, because you'd only discover it by reading every single command line flag for a command, and it's on by default. I only noticed it because a firewall blocked the access and I investigated the error. I feel worried others would miss this behavior and use this plugin in production facing systems, without realizing it has the potential to leak sensitive information and expose internals of company infrastructures.

Thank you for pointing out the telemetry gathered by sfdx itself, I didn't realize that. The sfdx tool has so many problems and doesn't follow the unix design philosophy. It's definitely not an example about how to write open source software (it's not really even open source I guess).

I am happy to submit a PR to remove this behavior fully, or optionally have an opt-in flag to enable amplitude reporting which isn't a dark pattern. Is that a PR you'd accept?

@nvuillam
Copy link
Owner

What about adding some message when installing the plugin, telling that anonymous usage stats are collected, and that if you don't want that you can either use --no-insight, or even use some env variable disabling stats ?

@AndrewRayCode
Copy link
Author

That still smells like a dark ui pattern to me as the common use case for post install message is asking for donations so most people ignore them

@nvuillam
Copy link
Owner

nvuillam commented Apr 21, 2021

Here is an event I send to amplitude (they all have the same structure/content)
What do you identify as a security risk within ?
As you can see, there is of course no ENV variable, I use stats to enhance the tool, i'm not a hacker lol ^^ (by the way this event is from the CI/CD used for one of my clients)

{
  "$insert_id": "e39b780a-ab2d-45a0-8d16-b91f13ef5a30",
  "$row_source": "realtime",
  "$schema": 12,
  "_time": 1619043568763,
  "adid": null,
  "amplitude_attribution_ids": null,
  "amplitude_event_type": null,
  "amplitude_id": 253621241803,
  "app": 286940,
  "city": null,
  "client_event_time": "2021-04-21T22:19:28.763",
  "client_upload_time": "2021-04-21T22:19:28.763",
  "country": null,
  "data": {
  },
  "device_brand": null,
  "device_carrier": null,
  "device_family": null,
  "device_id": "8f0aaf37-e87e-5d42-8fbd-69b05a4e8f65",
  "device_manufacturer": null,
  "device_model": null,
  "device_type": null,
  "display_name": "command",
  "dma": null,
  "event_id": 770875004,
  "event_properties": {
    "app": "sfdx-essentials",
    "appVersion": "2.7.0",
    "ci": true,
    "command": "essentials:packagexml:remove",
    "elapsedTimeMs": 125.9007015228,
    "options._": [
    ],
    "options.outputfile": "/tmp/sfdx-hardis-82huti/mainPackage.xml",
    "options.packagexml": "/tmp/sfdx-hardis-82huti/mainPackage.xml",
    "options.removepackagexml": "/builds/busalesforce/cermix-client/cermix/manifest/splits/packageXmlSharingRulesOpportunity.xml",
    "osPlatform": "linux",
    "osRelease": "5.4.0-65-generic",
    "statsVersion": 2
  },
  "event_time": "2021-04-21T22:19:28.763",
  "event_type": "command",
  "group_properties": {
  },
  "groups": {
  },
  "idfa": null,
  "ip_address": "127.0.0.1",
  "is_attribution_event": false,
  "language": null,
  "library": "amplitude-node/0.3.3",
  "location_lat": null,
  "location_lng": null,
  "os": "linux 5.4.0-65-generic",
  "os_name": "linux",
  "os_version": "5.4.0-65-generic",
  "paying": null,
  "platform": "Node.js",
  "region": null,
  "sample_rate": null,
  "server_received_time": "2021-04-21 22:19:28.763000",
  "server_upload_time": "2021-04-21T22:19:28.778",
  "session_id": -1,
  "start_version": "2.7.0",
  "timeline_hidden": false,
  "user_creation_time": "2021-04-21T22:19:16.481",
  "user_id": "acbd6113-f505-4edf-808f-0d9006c76bf3",
  "user_properties": {
  },
  "uuid": "a541ca3f-a2ef-11eb-805f-068cf432ab8d",
  "version_name": "2.7.0"
}

@AndrewRayCode
Copy link
Author

AndrewRayCode commented Apr 21, 2021

If you're asking for a specific attack vector, I think you're overlooking general software security. The more a third party knows about a system, the higher chance a third party has of infiltrating it. Leaking an os version could help show what vulnerabilities are available. Leaking what software is installed has the same potential. Leaking when certain scripts are run could help an attacker understand a deploy process or improve their phishing ability. Or what deploy tooling a company is using for phishing or an attack vector. There are file paths, dates, and I have no way of knowing if Amplitude scrubs real referrer IPs. And most secrets get put into environment variables, which this package can leak, that was disturbing to see in the source.

As you can see, there is of course no ENV variable

I don't understand, did you not write the source code of the Amplitude tracking code that you linked me to? It "of course" references environment variables. Maybe you aren't aware of what process.env does?

language: process.env.LANG || process.env.LANGUAGE || process.env.LC_ALL || process.env.LC_MESSAGES,
data.ci = process.env.CI ? true : false

Again I'm not sure how to help you understand that sending someone's infrastructure information to the public internet is an undesirable bad behavior.

I now view this package as a bad actor, which is my problem for trying to use it, not yours. The functionality I need is small enough to fork. I'm not sure what you get, if anything, out of having amplitude events, but this doesn't seem like something you're willing to take on.

@nvuillam
Copy link
Owner

nvuillam commented Apr 22, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants