Talk:GitHub Copilot

This is the talk page for discussing improvements to the GitHub Copilot article.
This is not a forum for general discussion of the subject of the article.

Put new text under old text. Click here to start a new topic.
New to Wikipedia? Welcome! Learn to edit; get help.

Article policies

Find sources: Google (books · news · scholar · free images · WP refs) · FENS · JSTOR · TWL

Software: Computing Low‑importance

	This article is within the scope of WikiProject Software, a collaborative effort to improve the coverage of software on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.SoftwareWikipedia:WikiProject SoftwareTemplate:WikiProject Softwaresoftware
Low	This article has been rated as Low-importance on the project's importance scale.
	This article is supported by WikiProject Computing.

Microsoft Low‑importance

	This article is within the scope of WikiProject Microsoft, a collaborative effort to improve the coverage of articles relating to Microsoft on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.MicrosoftWikipedia:WikiProject MicrosoftTemplate:WikiProject MicrosoftMicrosoft
Low	This article has been rated as Low-importance on the project's importance scale.

Copilot is good for what it does, why is this article so negative?

I'm a software engineer using Copilot daily -- it's pretty good. I believe the article should explore more about what Copilot can and can't do in addition to just listing out controversies. 2603:6080:3F40:270B:98C6:98FC:EC03:DF30 (talk)

Well there's a huge problem with it being cloud based instead of running locally. It hands over all our data to a service that we have no control over. It's likely in a couple of years (maybe a decade) law enforcement might start collecting the data in order to fingerprint developers, so every snippet of code on the Internet will be traceable, if we all end up all using this service. Definitely intelligence agencies will be doing this. Once the data is there, people in power will get their hands on it, it is only a matter of time, this has been proven time and time again. And this has implications for people writing controversial software e.g. in the cryptocurrency industry. It will at the very least cause a chilling effect. 86.130.92.69 (talk) 09:39, 18 October 2022 (UTC)[reply]

Privacy and Ethical concerns

I've added a section on the privacy and ethical concerns of switching to what is essentially a cloud editor, where every single keystroke is logged to the cloud. Also a mention of offline alternatives would be useful, even if those are poor alternatives at present. — Preceding unsigned comment added by 86.130.92.69 (talk) 09:35, 18 October 2022 (UTC)[reply]

And there will be a large flow of analytics data which no doubt is going to be mined and psychological profiles built from that, in a worst case scenario.

So basically it's turning our private computers into terminals and if it's usage becomes widespread in the industry we will have to submit to this otherwise we will become uncompetitive. It is yet another big expansion of the government and corporate power over society as more and more things become online by default.

The Free Software Foundation has more information on the dangers of doing this, which start to become apparent after roughly a decade of widespread usage. ^[1] 86.130.92.69 (talk) 10:00, 18 October 2022 (UTC)[reply]

Should we write about the whole leaking secrets fiasco?

Many people are worried (mostly on Twitter) that GitHub Copilot can leak secrets.

There has already been at least one report of secrets being leaked^[2], however it is still unclear whether they were real.^[3]

Peter Placzek (talk) 19:26, 7 July 2021 (UTC)[reply]

I’m unsure if Twitter posts are enough to warrant addition here. 🐶 EpicPupper ^{(he/him | talk, FAQ, contribs | please use {{ping}} on reply)} 21:12, 7 July 2021 (UTC)[reply]

References

^ "Who does that server really serve?". gnu.org. Retrieved 18 Oct 2022.
^ "Archived tweet talking about the report". Archive.org. Retrieved 7 July 2021.
^ "Same user talking about why he deleted the tweet". Twitter. Retrieved 7 July 2021.

Kuhn (2022) article

I have not got the expertise to add this, but it might be relevant? RobbieIanMorrison (talk) 20:21, 8 February 2022 (UTC)[reply]

Kuhn, Bradley M (3 February 2022). "If software is my Copilot, who programmed my software?". Software Freedom Conservancy. New York, USA. Retrieved 2022-02-08.

According to the FSF it certainly is. This is one of the 5 papers underscored by an anonymous review. --Palosirkka (talk) 08:20, 14 March 2022 (UTC)[reply]

I added a sentence about the SFC dropping GitHub and their reasoning in that post. Wqwt (talk) 04:34, 8 September 2022 (UTC)[reply]

Wiki Education assignment: WRIT 340E Spring 2022

This article was the subject of a Wiki Education Foundation-supported course assignment, between 10 January 2022 and 29 April 2022. Further details are available on the course page. Student editor(s): Reereef, JackMandelkorn, CarlosDanielRLao (article contribs).

Litigation beginning November 2022

Formal litigation started in early November 2022. A web search on "github copilot litigation" will throw up quite a few good sources. There is even a dedicated website: githubcopilotlitigation.com. Sorry but I have not the time to write this up. RobbieIanMorrison (talk) 21:37, 30 November 2022 (UTC)[reply]

Release date in info box incorrect?

The infobox states that Copilot was released in October 2021. However, the text mentions that it was released for technical preview in June. In October additional extensions for Neovim and Jetbrains IDEs had been released, but I could not find any credible references mentioning anything else happening in October. I suggest, that this release date is either changed to June 2021, when the technical preview went live or June 2022, when GitHub announced that Copilot would be available for all developers. What do you think? 87.163.207.60 (talk) 12:31, 31 January 2024 (UTC)[reply]

About Ethical concerns

Maybe we should add that as a countermasure for AI crawling git repos to "learn" sourcecode against the will of the authors one could intentionally provide repos with fake-sourcecode which is full of errors and bad behavior to fool those AI crawling attempts. You break fair use, we brake your attempts to learn my code.

This is more of a hypothetical thing, but nothing in the way to actually implement it easily and fast. Some kind of nonsense-programming culture may arise out of it, to fight AI and protect intellectual property. Time will tell. 2003:D5:7F2C:7F00:D708:2986:913A:ACA8 (talk) 08:41, 27 October 2024 (UTC)[reply]

Removal of content on the security of generated code

I had concerns about the usefulness for readers of some content that was recently added to the "Reception" section by new users.

First, it seems to me that including the outcomes of a 2021 study gives an outdated view of Copilot's current capabilities: most readers are likely interested in Copilot current capabilities, not about its capabilities in 2021. For those readers, it's unclear what to infer from a 2021 study. A sentence in the article argues "Even though this study was in 2021, it is still relevant to this day as training data is the back bone of every large language model and is one of the biggest factors in how it responds." That makes sense, but nowadays, LLMs are also fine-tuned, and their performances in software development have drastically improved (and by the way, this kind of unsourced argumentative sentence, and others in the "Security" subsection, shouldn't be in the article). Even the study "Lost at C" seems to have been made on an old model, likely based on GPT-3.

Secondly, I think the content is overly verbose, even though it's well-written. The "Security" subsection is quite long, and yet after reading it, it's still not much clearer whether that means that LLMs are rather good or bad in terms of security. Even with context, it's quite difficult to meaningfully interpret the results of this kind of individual study (GPT-3 is a very outdated model, and on the other side the test involved narrow well-defined coding tasks rather than modifying complex codebases...). That content could be interesting to an audience of researchers on the topic, but it looks too detailed and complex to interpret for a general audience. There is also redundancy between the content at the beginning of the "Reception" section and the content in the "Security" subsection.

Also, I didn't find a good source for the paragraph about Verilog. There are also issues with the formatting and the placement of the references.

So I deleted the content. Alenoach (talk) 21:54, 29 November 2025 (UTC)[reply]

[1] "Who does that server really serve?". gnu.org. Retrieved 18 Oct 2022.

[2] "Archived tweet talking about the report". Archive.org. Retrieved 7 July 2021.

[3] "Same user talking about why he deleted the tweet". Twitter. Retrieved 7 July 2021.

[1]