anhinga_anhinga: (Default)
"Evaluating Large Language Models Trained on Code", arxiv.org/abs/2107.03374

A detailed description of early (pre-GitHub Copilot) versions of OpenAI Codex. This is the "paper of the year" so far: we finally have real progress in AI-assisted computer programming (and the difficulty of computer programming is the key bottleneck limiting the speed of progress).

See comments in dmm.dreamwidth.org/44860.html for details.

JuliaCon 2021 starts on July 20 with 8 days of workshops followed by 3 days of the main conference. JuliaCon 2020 was great; this one is likely to be even better.

This is a fully virtual conference for the second year in a row; registration is free and is needed to access interactive features, poster sessions, and such, but the bulk of the materials will be accessible via YouTube without registration. I created a post dmm.dreamwidth.org/46160.html with links, and I'll keep populating it with various comments as the conference progresses.

Cross-post: anhinga-anhinga.livejournal.com/85003.html

anhinga_anhinga: (Default)
On May 28, 2020 OpenAI published the GPT-3 paper, "Language Models are Few-Shot Learners", arxiv.org/abs/2005.14165

This was the "AlexNet moment of the Transformer Revolution", and the qualitative jump was even more significant than the AlexNet 2012 jump.

One extremely strange and remarkable property of GPT-3 is that purely linguistic knowledge in this model is often sufficient to guess a piece of correct computer code from a natural language description of a problem (even though we don't think this model "truly understands" programming).
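
To illustrate the kind of thing that happens (a hypothetical example in Julia, not an actual GPT-3 transcript): given only a natural-language comment as a prompt, the model often completes it with plausible, correct code.

    # Prompt given to the model:
    #   "write a Julia function that returns the sum of squares
    #    of the even numbers in xs"
    #
    # A completion of the kind the model often produces:
    sum_even_squares(xs) = sum(x^2 for x in xs if iseven(x))

    sum_even_squares(1:6)  # 2^2 + 4^2 + 6^2 = 56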

There was already quite a boom around these novel models (invented as recently as 2017) after BERT and GPT-2, but now the field has simply exploded: "efficient transformers", "vision transformers", "multimodal transformers", etc.

And tons of interesting work was done on hybrid models combining transformers and other attention-based models with all kinds of other techniques. Hybrids of this kind are probably the future. For example, the famous AlphaFold 2 by DeepMind, which "solved" protein folding in November 2020, is a hybrid model with an attention-based component at its center: deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology

All this means two things: 1) "True AI" can emerge at any moment; we are seeing a flood of breakthroughs now, and one of them (or a short sequence of them) might result in a much more radical shift than anything we've seen so far. I don't know whether it will happen this year, but it's a possibility (from the invention of convolutional neural nets in 1989 it took more than 20 years to get to AlexNet; from the invention of Transformers in 2017 it took only 3 years to get to GPT-3; things can really start happening very fast).

2) If you are a practitioner in any field (especially if that field is some kind of machine learning), it makes sense to ponder hybrids between your favorite methods and "attention" (which is just a linear combination of high-dimensional vectors [sometimes with all coefficients being non-negative and summing up to 1]), hybrids between your favorite methods and matrix multiplication (which is just a way to compute many linear combinations of high-dimensional vectors rapidly), and hybrids between your favorite methods and Transformers (a certain way of arranging those matrix multiplications and interleaving them with modest neural connectors). This is likely to be very fruitful, and it is how you can supercharge your favorite methods and produce novel results.
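
For concreteness, here is a minimal sketch in Julia of that core "attention" operation (plain dot-product attention with softmax weights; the names and dimension conventions are my own choices, not from any particular paper):

    # softmax turns a score vector into non-negative coefficients summing to 1
    softmax(s) = (e = exp.(s .- maximum(s)); e ./ sum(e))

    # Columns of Q are queries, columns of K are keys, columns of V are values.
    # Each output column is a convex combination of the value vectors.
    function attention(Q, K, V)
        d = size(K, 1)
        scores = (K' * Q) ./ sqrt(d)                  # one score column per query
        weights = mapslices(softmax, scores; dims=1)  # coefficients per query
        return V * weights                            # linear combinations of values
    end

    d, nk, nq = 8, 5, 3
    attention(randn(d, nq), randn(d, nk), randn(d, nk))  # a d-by-nq matrix

Note that the whole thing is just matrix multiplications plus a normalization; that is exactly what makes it so easy to graft onto other methods.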

Cross-post: anhinga-anhinga.livejournal.com/84392.html

anhinga_anhinga: (Default)
Julia is an unusual language. It is built around the idea of "eating your cake and having it too, again and again": flexible and very fast at the same time, a friendly readable syntax together with Lisp-strength macros and multiple dispatch, etc.:

https://julialang.org/blog/2012/02/why-we-created-julia/
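
As a tiny taste of the multiple dispatch mentioned above (a made-up toy example): the method is chosen by the runtime types of all arguments, not just the first one.

    abstract type Pet end
    struct Dog <: Pet end
    struct Cat <: Pet end

    meets(::Dog, ::Dog) = "sniffs"
    meets(::Dog, ::Cat) = "chases"
    meets(::Cat, ::Dog) = "hisses"
    meets(::Cat, ::Cat) = "ignores"

    meets(Dog(), Cat())  # "chases" -- dispatch looks at both argument types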

Julia Flux is trying to become the next-generation machine learning framework, and it is also characterized by this approach of "eating your cake and having it too". If TensorFlow 1.0 is the past and PyTorch is the leading state-of-the-art framework of the present, Julia Flux is quite likely to become the machine learning framework of the future; see the first comment in this blog post for details:

https://dmm.dreamwidth.org/23453.html
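
For a feel of what Flux code looks like, here is a minimal sketch of defining and differentiating a small model (assuming the standard Flux API of around 2020: Chain, Dense, Flux.params, gradient; details may differ between versions):

    using Flux

    # A tiny two-layer model: 2 inputs -> 8 hidden units -> 1 output.
    model = Chain(Dense(2, 8, relu), Dense(8, 1))
    loss(x, y) = Flux.mse(model(x), y)

    # A batch of 16 random inputs and targets (columns are samples).
    x = rand(Float32, 2, 16)
    y = rand(Float32, 1, 16)

    # Zygote differentiates the ordinary Julia code of the model.
    ps = Flux.params(model)
    gs = gradient(() -> loss(x, y), ps)
    Flux.Optimise.update!(Descent(0.1), ps, gs)  # one gradient-descent step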

Does anyone here use Julia, or does anyone here know someone who uses Julia?

Crosspost: anhinga-anhinga.livejournal.com/84046.html
anhinga_anhinga: (Default)
I hope for us all to have a creative and safe New Year.

Computer art news: I started to play with OpenGL shaders and with Shadertoy.com: dmm.dreamwidth.org/20076.html



Other news: books and stories, my own texts, open source activity and software experiments, employment change, etc: dmm.dreamwidth.org/23061.html

I generally shifted quite a bit towards dreamwidth and this blog during this year, and away from livejournal; this was not planned, but just happened "organically". Most of my activity this year was at dmm.dreamwidth.org

Crosspost: anhinga-anhinga.livejournal.com/83885.html

anhinga_anhinga: (Default)
In 2018, we did two series of experiments with self-referential neural nets with vector flows ("dataflow matrix machines").

The ability of a neural net to modify itself on the fly was used to edit it interactively while it was running ("livecoding"). This also opens the way to having populations of neural nets editing each other.

Emerging "sleep-wake" behavior and other bistability patterns were observed in randomly initialized neural nets (May 2019 update: a couple of video recordings of those behaviors are posted at https://youtu.be/_mZVVU8x3bs and https://youtu.be/CKVwsQEMNjY ). There is no theoretical understanding of these emergent dynamics yet.
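
To convey the flavor of the self-referentiality (a toy sketch in Julia, not the actual update rules from our papers; the Hebbian-style update here is an illustrative stand-in): the network matrix is itself part of the mutable state flowing through the system, so the network's own activity can rewrite it on every step.

    # Toy self-referential net: the matrix W is part of the mutable state,
    # and each step uses the net's own activity to modify W on the fly.
    mutable struct SelfRefNet
        W::Matrix{Float64}
    end

    function step!(net::SelfRefNet, x::Vector{Float64}; η = 0.01)
        y = tanh.(net.W * x)       # the ordinary vector flow
        net.W .+= η .* (y * x')    # self-modification (a Hebbian-style toy rule)
        return y
    end

    net = SelfRefNet(0.1 .* randn(4, 4))
    step!(net, randn(4))           # running the net also edits the net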

Crosspost: anhinga-anhinga.livejournal.com/83697.html

Other blogs by this author:

https://dmm.dreamwidth.org/  (partial mirror: https://anhinga-travel.livejournal.com/ )
https://anhinga-drafts.livejournal.com/  (mirror: https://anhinga-drafts.dreamwidth.org/ )

anhinga_anhinga: (Anhinga)
A year ago I posted about dataflow programming and linear models of computation:

http://anhinga-anhinga.livejournal.com/82757.html

It turns out that those dataflow matrix machines are a fairly powerful generalization of recurrent neural networks.
anhinga_anhinga: (Default)
I am using the friends list here mostly to create a useful Friends page.

And this is my Friends of Friends page.

I have been filtering my default Friends page somewhat since December 2005. This is my unfiltered Friends page. This is an "express selection" (about 20% of the volume).

Russian Virtual Keyboard by Paul Gorodyansky. Online translation by translate.ru. All tags of this journal.



LiveJournal notifications about new friends or new comments work only in some cases.

If I did not notice something, please leave a comment in the post below this one.


anhinga_anhinga: (Default)
This is a prototype of an elderly home care robot developed by a very small group at IBM (with benign indifference from their employer, which does not want to deal with robots and the headaches and liabilities associated with them). Its ability to verbally communicate with a human and to learn from a human is very impressive (basically, one can program this robot to a large extent simply by talking to it). Here is a demo video from a talk at the AGI-12 conference:

http://www.youtube.com/watch?v=M2RXDI3QYNU

The paper itself, "An Extensible Language Interface for Robot Manipulation", which explains to some extent how this works, can be found here:

http://www.mindmakers.org/boards/18/topics/73

and the free online version of the AGI-12 proceedings is here (scroll down to "AGI-12 Contributed Paper Sessions"):

http://www.mindmakers.org/projects/agiconf-2012/wiki/Schedule
anhinga_anhinga: (Default)
"The AGI conferences are the only major conference series devoted wholly and specifically to the creation of AI systems possessing general intelligence at the human level and ultimately beyond." This is a small conference (I expect around 200 people, give or take), and this year it comes in two parts, AGI-12 and AGI-Impacts:

http://agi-conference.org/2012/schedule/

http://www.winterintelligence.org/#calendar

The videos from the AGI-11 are available here:

http://agi-conf.org/2011/
anhinga_anhinga: (Default)
There are two ballot questions in Massachusetts this year aiming to improve the human rights situation in the state:

http://www.sec.state.ma.us/ele/ele12/ballot_questions_12/quest_3.htm

ballotpedia entry

***

http://www.sec.state.ma.us/ele/ele12/ballot_questions_12/quest_2.htm

ballotpedia entry

I hope that Massachusetts will vote "Yes" on these questions.
anhinga_anhinga: (Default)
http://boingboing.net/2012/10/28/when-your-news-comes-from-nerd.html

'It takes a while watching TWC before you realize that they are such weather nerds that they sometimes tend to see things from the storm's point of view. They talk about the shape of the storm as beautiful, or "great," or "improving," and what they mean is that the storm is thriving. It's along the lines of, "This storm is looking great. Your lawn furniture? Not so much."'

(also http://boingboing.net/2012/10/28/eastern-us-hunkers-down-for.html )
anhinga_anhinga: (Default)
http://sci-humor.blogspot.com/2012/10/blog-post_24.html

"В начале не было ничего, только полная симметрия, и свободная калибровка летала над водами.

Потом отделил Бог целый спин от полуцелого, и повелел целому спину подчиняться статистике Бозе, а полуцелому статистике Ферми. И увидел он, что это хорошо.[...]"

I've imported it into LJ: http://sci-humour.livejournal.com/
anhinga_anhinga: (Default)
A bit of information related to machine learning and philosophy talks at MIT.ExpandRead more... )

"Там"

Sep. 6th, 2012 07:04 pm
anhinga_anhinga: (Default)
http://flibusta.net/b/269237

A relatively recent text by Boris Akunin under a different pseudonym: http://flibusta.net/a/38051

I liked it a lot...
anhinga_anhinga: (Default)
Is there any good reason to restrict the availability of the Friends page to the most recent two weeks?

Does this restriction really help LJ to scale?
anhinga_anhinga: (Default)
Community-based participatory art, music, and dance. "We encourage you to dress up, bring your own art or games and most importantly, to participate!"

http://boston.figmentproject.org/

http://boston.figmentproject.org/2012/07/2012-music-schedule/

http://en.wikipedia.org/wiki/Figment_(arts_event)
anhinga_anhinga: (Default)
"In Charo City, in a cathedral to Christ Uploaded, which to my surprise he asked for, I married Scile according to Bremen law, in the second degree, registering as a nonconnubial love-match, and I took him to Embassytown."
