Google has on several occasions made moves to make the web less semantic.
They dumped microformats and standards in favor of soupy, error-tolerant formats that benefited their search engine and made it harder for other efforts to make information shareable and accessible.
They wanted it to be easy to get information in, but for you to have to go through them to get information out.
riffraff 11 hours ago [-]
> They dumped microformats and standards
I'm not sure they killed microformats, they still support hReview, hProduct etc, don't they?
And they pushed schema.org. I wrote a trivial recipe importing tool that just works™ on a bunch of websites because it uses the JSON-LD Recipe schema.
It's ~100 lines and a ton simpler than what I had to write 15 years ago.
Sure, they pushed for HTML5-style stuff, but that's not much of killing things.
IMO it's not google that stopped microformats: it's that website owners realized most of the time it was advantaging third parties for no advantage to them.
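The JSON-LD approach riffraff describes can be sketched in a few lines of stdlib Python. This is an illustrative reconstruction, not riffraff's actual tool; the function names and the `@graph` handling are my own assumptions about how such a script typically looks:

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collects the text content of <script type="application/ld+json"> tags."""
    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks.append(data)

def find_recipe(html):
    """Return the first JSON-LD object whose @type is Recipe, or None."""
    parser = JsonLdExtractor()
    parser.feed(html)
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        # Some sites wrap the recipe in a list or an @graph container.
        candidates = data if isinstance(data, list) else data.get("@graph", [data])
        for obj in candidates:
            if isinstance(obj, dict) and obj.get("@type") == "Recipe":
                return obj
    return None
```

Real pages vary (the `@type` can be a list, for instance), which is where the rest of a ~100-line version goes.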
mplanchard 1 day ago [-]
I hand-rolled an atom feed for my statically generated blog. It’s a reasonable, easy format to work with.
<https://github.com/w3c/feedvalidator/commit/8ca2fd4e5a520158...>
It was an alternative to RSS from 20 years ago that didn't catch on.
ravenstine 16 hours ago [-]
I thought it did in fact catch on but most people still referred to it as "RSS".
eloisant 11 hours ago [-]
It did catch on, pretty much everything that supported RSS also supported Atom.
It's just that they both fell out of fashion when social media decided they prefer to keep their users captive than accepting interop.
eduction 9 hours ago [-]
I've never seen an Atom formatted podcast. NYTimes and WSJ each have a whole page devoted to their RSS feeds, I've never seen an Atom feed from either of them. It caught on sorta but didn't get the traction of what it was designed to replace. (Not saying this makes it Bad, btw.)
kevincox 8 hours ago [-]
That's a good point. Podcasts are still (almost?) exclusively RSS 2.0. IDK if this is just momentum or Apple rules but I don't think I've ever seen an Atom podcast.
But many podcast clients actually still support Atom (probably using a feed library that supports various formats?) and basically all non-podcast feed readers support Atom.
riffraff 17 hours ago [-]
I think it caught on well enough, platforms such as Wordpress still support it out of the box (I just checked my blog, it works).
I liked Atom's clean design, but it felt like it was mostly pushed by Google (I may be misremembering) and in the end the syndicated web faded into obscurity anyway.
RSS 2.0 is kinda an unspecified mess, and at least 15 years ago, if you wanted to be compatible with the majority of content you needed some weird heuristics to detect which interpretation of the spec a given feed was using (lol).
And Dave Winer was strongly against ever clarifying the spec, and that’s part of what led to Atom.
talideon 14 hours ago [-]
Not really, and it's still more error-prone than Atom.
There's really no good reason to use anything other than Atom.
echelon 20 hours ago [-]
IIRC, Aaron Swartz was one of the contributors to the format. RIP.
el_io 9 hours ago [-]
I remember him being a contributor to the RSS format; why would he also contribute to another similar format?
Kind of. It is now really just a caching proxy making it mostly useless.
Although I have found it occasionally useful for sites that have over-active bot-blocking on their feeds because Feedburner is often whitelisted.
intrasight 1 day ago [-]
First iteration of Google's APIs were atom. I do miss XML.
abustamam 23 hours ago [-]
One of the API providers I use at work returns responses in XML and we use an XML parser to parse it to JSON and even then it's not perfect.
What do you like about XML? I feel like I'm missing something.
deaddodo 20 hours ago [-]
The main benefit of XML over JSON is that it is structured, and can be associated with schemas for built-in validation.
Obviously, that's only a benefit if you care about and utilize those features; most teams doing JSON integrations will just build those into the consumer in lieu of them being provided by the transport. But it is something that some people (especially larger enterprise organizations) value.
dolmen 18 hours ago [-]
JSON is structured (not plain text to be analyzed by an AI).
JSON has JSON Schema.
In addition, JSON is easier to parse and to map to common data structures of programming languages.
jeroenhd 12 hours ago [-]
JSON Schema is an unofficial spec with a bunch of competitors and multiple versions, not all of which are compatible. I don't think you can compare it to XML schema validation.
I'm also not so sure about JSON being easier to map to common data structures. The lack of order guarantees within objects makes things like ordered maps quite annoying (you need to either use an array of entries with key and value, or an index within the mapped objects).
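The array-of-entries workaround mentioned above can be made concrete. A minimal Python sketch (the function names are mine): since the JSON spec attaches no meaning to object member order, encoding an ordered map as an array of pairs is the only portable option.

```python
import json

def dump_ordered(mapping):
    """Encode a mapping as a JSON array of [key, value] pairs, so that
    ordering survives any compliant JSON implementation."""
    return json.dumps([[k, v] for k, v in mapping.items()])

def load_ordered(text):
    """Decode the array-of-pairs form back into a dict (Python dicts
    preserve insertion order, so ordering is retained on this end too)."""
    return dict(json.loads(text))
```

Round-tripping `{"first": 1, "second": 2, "third": 3}` through these keeps the keys in order, which a plain JSON object cannot guarantee across implementations.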
deaddodo 9 hours ago [-]
"Structured" in this case refers to being able to be directly mapped to a data structure. Think protobuf and other similar transport mechanisms. The recipient knows what structure to expect because it's not a valid XML document if it's breaking those constraints.
JSON is not; it is closer to the PHP, JS, etc. "object" type, which is an ephemeral object with arbitrary member associations.
And, to be clear, this is not a value judgement. They just excel in different fields. XML tends to be easier for strongly and strictly typed languages such as C/C++, C#, Java, etc where you can use the schema to generate your structs automatically. Vanilla JSON is easier for higher level languages that don't require you to manually create a mapping/validation level. JSON Schema tries to bridge that gap to a degree, but isn't built into the standard and isn't even universal.
But, ultimately, both are perfectly sufficient for either use case. It just depends on how much massaging you want to do to make them work.
abustamam 13 hours ago [-]
Thanks, that's interesting to know. Given that we have JSON Schema now, though, what reason would someone have to use XML over JSON?
deaddodo 9 hours ago [-]
JSON Schema is largely an answer to people seeking that type of built-in validation. As I'm not a huge proponent of either (a tool is a tool and you work with it in its ecosystem), I don't have personal feelings on its adequacy.
But, I would suspect, proponents of XML would still point to its deeper typing system, document structure (especially its hierarchical features), and extremely mature ecosystem + tooling (such as XSLTs) as reasons to prefer it over JSON w/ JSON Schema.
abustamam 7 hours ago [-]
Gotcha! Thanks for the rundown. I started programming at the time when we were transitioning from XMLHTTPRequest to Fetch with json so I know of XML but basically only learned about json.
theshrike79 18 hours ago [-]
XML had DTDs and Schemas 20 years ago.
JSON is still figuring it out.
thiht 12 hours ago [-]
If XML+DTD was so great, it would still be used.
abustamam 7 hours ago [-]
I don't think that's a fair assessment. Plenty of great technologies died out for many reasons unrelated to their effectiveness.
abustamam 13 hours ago [-]
JSON has JSON Schema. What are DTDs?
theshrike79 12 hours ago [-]
JSON Schema has existed for maybe 6 years in theory, in practice a few years.
As for DTD: https://en.wikipedia.org/wiki/Document_type_definition
Basically it tells the system what elements are allowed in which places and what attributes they can contain.
<!ELEMENT html (head, body)>
Defines an html element that can contain a head and a body, nothing else. Anything extra or missing will fail the validator.
It was kinda-sorta eventually superseded by XML Schema that could also define what KIND of data the attributes could contain, but did exist at the top of XML/HTML/SGML documents for years.
abustamam 7 hours ago [-]
Ah interesting. Whenever I write an API I'll use Zod and whatever middlewares my framework needs to generate json schema for consumers, and whenever I consume an API I will use Zod to parse.
It would be nice if it were just built into the spec though!
refulgentis 21 hours ago [-]
I don't reach for it often, but I've been around the block a bit: the CC processors in the iPad point of sale I built circa 2010 used it, and it seemed a bit off/unnecessary.
In retrospect, it's useful for creating islands of sanity/enforcement in a codebase. A lightweight way to give type annotations across organizational boundaries.
> we use an XML parser to parse it to JSON and even then it's not perfect
I can't quite picture this: how does one parse XML to JSON? I assume there's code that's parsing XML and returning a JSON object? What would make this not perfect, other than a poor implementation of the translator? Would their using JSON help? If JSON is a less expressive format than XML, is it possible to 100% translate their XML to JSON?
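The lossiness question has a concrete answer: a naive XML-to-JSON converter cannot be perfect, because XML distinguishes things JSON cannot. A small Python sketch (the converter and the element names are made up for illustration) shows three of the failure modes in one function:

```python
import xml.etree.ElementTree as ET

def naive_to_dict(element):
    """Naively convert an XML element to a dict-shaped value. Three kinds
    of information are lost: attributes and child elements collide in one
    namespace, repeated child tags overwrite each other, and mixed content
    (text interleaved with children) is dropped entirely."""
    if len(element) == 0 and not element.attrib:
        return element.text
    result = dict(element.attrib)          # attributes become keys...
    for child in element:
        result[child.tag] = naive_to_dict(child)  # ...and so do children
    return result
```

Running it on `<order id="1"><item>apple</item><item>pear</item></order>` yields `{"id": "1", "item": "pear"}`: the first `<item>` is silently gone. Real converters work around this with conventions (`@id` prefixes, forcing arrays), which is exactly the "massaging" discussed above.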
abustamam 20 hours ago [-]
> useful for creating islands of sanity/enforcement in a codebase
Thanks for the insight! Is this what JSDoc/Swagger is now used for?
> I can't quite picture this: how does one parse XML to JSON?
I'm not sure actually. I haven't personally seen the code, I just hear my coworkers always lambasting that API provider for their usage of XML. Maybe it's just their lack of documentation that sucks, but it's become a running joke whenever we get a new partner that the team integrating it jokes that their API is XML.
jeroenhd 12 hours ago [-]
> I just hear my coworkers always lambasting that API provider for their usage of XML
I hear this too, but often when I ask why people say things like that, it's either because XML is "outdated" or because they don't like it.
It's like programs written in C or C++: very few large projects choose those languages nowadays, often for good reasons, so the projects written in those languages are usually 10 to 20 years old. Age comes with a lot of legacy cruft and obscure behaviour, but that's not the fault of those languages per se. Or for people blasting banks for using COBOL, even though COBOL is a perfectly fine high-performance language for the niches bank mainframes serve.
drob518 22 hours ago [-]
Well, that’s a blast from the past.
tkcranny 1 day ago [-]
I’m not clear on the difference between atom and RSS. Atom seemed to be the better spec, but for my Astro blog I ended up sticking to the built in `rss` helper it ships with.
JimDabell 14 hours ago [-]
In the beginning was RSS 0.x. It was originally intended to be based on RDF. Compromises were made and it ended up dropping the RDF. The spec. wasn’t very good and had several ambiguities.
Some people forged ahead with a cleaned up RDF-based version and called it RSS 1.0, while other people went ahead with the ambiguities but without RDF and called it RSS 2.0. The person publishing RSS 2.0 considered it finished and refused to update it. There was drama.
A bunch of people decided that there was too much to clean up from within that mess and started a new format, Atom. This ended up being a much better spec. with an official RFC, but at this point everybody was calling any type of feed “RSS”, even if it was Atom.
If you have the choice, you should pick Atom.
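For a sense of what "picking Atom" costs, a feed is small enough to generate with the stdlib alone. A sketch (the URLs and author are placeholders) that produces roughly the minimum RFC 4287 asks for: `id`, `title`, and `updated` on the feed and each entry, plus a feed-level author:

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM_NS)  # serialize Atom as the default namespace

def minimal_feed():
    """Build a minimal Atom feed document as a string."""
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}id").text = "https://example.com/feed"
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = "Example blog"
    ET.SubElement(feed, f"{{{ATOM_NS}}}updated").text = "2024-01-01T00:00:00Z"
    author = ET.SubElement(feed, f"{{{ATOM_NS}}}author")
    ET.SubElement(author, f"{{{ATOM_NS}}}name").text = "Jane Doe"  # placeholder

    entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = "https://example.com/posts/1"
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = "Hello, Atom"
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = "2024-01-01T00:00:00Z"
    ET.SubElement(entry, f"{{{ATOM_NS}}}link").set("href", "https://example.com/posts/1")
    return ET.tostring(feed, encoding="unicode")
```

Run the full output through a feed validator before publishing; the RFC has a few conditional requirements (e.g. around entry content vs. alternate links) this sketch glosses over.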
kinow 13 hours ago [-]
I also didn't know much of the difference between the two, and I also used RSS for my Hugo site.
At the bottom of the article, under "See Also", there's a link to this page comparing RSS and Atom: https://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared...
It seems like the last update is from 2008, but the section on the differences has a few interesting items. I am not sure if it changed, but it says:
"The RSS 2.0 specification is copyrighted by Harvard University and is frozen. No significant changes can be made (although the specification is under a Creative Commons licence) and it is intended that future work be done under a different name; Atom is one example of such work."
The Wikipedia RSS page also has a small section comparing RSS and Atom: https://en.wikipedia.org/wiki/RSS#RSS_compared_with_Atom
"Technically, Atom has several advantages: less restrictive licensing, IANA-registered MIME type, XML namespace, URI support, RELAX NG support.[35]"
gabazing 13 hours ago [-]
Same here. Astro has the @astrojs/rss package but no atom equivalent. There should be an atom option in the same package, or an @astrojs/atom package is needed.
There is an npm package called astrojs-atom but I am not sure it is official or safe.
If any Astro core developer is reading this: please add an atom option to rss.
perrohunter 1 day ago [-]
what is old is new again?
hnlmorg 1 day ago [-]
No, this is just old.
Pity though. RSS / Atom was a fantastic concept and it’s a real pity big tech killed them off.
rambambram 1 day ago [-]
Nothing is killed. It still exists, it's an open protocol after all. And I choose to use it, it's pretty fun to calmly follow around 2000 feeds from - mostly - blogs from HN. And cars... I need my car blogs.
geodel 1 day ago [-]
Agreed. That people, or even big companies, nowadays find it outside their core competency to host their own blog and offer Atom/RSS feeds is not because big tech killed it.
darreninthenet 16 hours ago [-]
How do you curate and keep on top of so many feeds? I have ~10 on my RSS reader and I sometimes have trouble keeping up if I have a couple of busy days
Basically, I get to see the latest post from a random feed. Nothing else. No lists of unread new posts from all the feeds. If I like the title and short summary, I click through to the website or blog itself where I can read the whole thing. There's no FOMO this way, or information overload. Just one post at a time.
Because the whole list of feeds is curated by myself, I know that everything is at least a little interesting. I even made a category with YouTube channels that I like, so I can skip their annoying recommended-videos algo.
Next to this basic functionality, I made what I call 'Newspapers'. These are certain topics with a bunch of selected feeds attached, they get checked automatically in the background. When the Newspaper has enough articles, I see a new Newspaper appear. Otherwise it might take months before a feed is shown in the random selection.
holistio 20 hours ago [-]
Is there any platform for sharing what feeds we follow? Would love to discover some new blogs.
Well, my guess is that OPML is underrated. And I understand that, because it's so different from the social media that we are used to. On my homepage (link in bio) you can find all the feeds that I follow, available as an OPML file. It might be of interest to you, it might not (probably a lot of blogs you know from here, at least half of my 2000 feeds).
Or you create a blog for yourself and you make a blogroll.
As for discovering new blogs, a couple of options, but there are more out there: https://ooh.directory, https://blogroll.org/
One 'dream' of mine is to have OPML be the discovery-glue between all kinds of individual personal websites and blogs. But this requires critical mass to have enough to discover and explore, and it needs some fun/interesting software to do that.
ushimitsudoki 21 hours ago [-]
[dead]
pletnes 18 hours ago [-]
Lots of sites publish outages, incidents, downtime over RSS/atom. Works great for monitoring, post them into slack with a bot and you can start a discussion thread about that incident where you first hear about it.
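A monitoring hook like that needs little more than stdlib Atom parsing. A sketch (the polling loop and the Slack webhook call are omitted, and the function name is made up); Atom's stable entry `id`s make a natural dedup key so each incident is announced once:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def new_entries(feed_xml, seen_ids):
    """Return (id, title) pairs for entries not yet in seen_ids, and
    record them in seen_ids so the next poll skips them."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(f"{ATOM}entry"):
        entry_id = entry.findtext(f"{ATOM}id")
        if entry_id not in seen_ids:
            seen_ids.add(entry_id)
            fresh.append((entry_id, entry.findtext(f"{ATOM}title")))
    return fresh
```

Each `(id, title)` pair returned would then be posted to the chat webhook; calling the function again with the same `seen_ids` set returns nothing, so polling is idempotent.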
bawolff 22 hours ago [-]
Meh. Big tech didn't kill it off; it was already dead at that point. Sometimes things just aren't popular, no matter how much we might want them to be.
lolive 21 hours ago [-]
Google Reader was uber-popular at one time, then Google decided that syndication of articles, with comments, had to be an exclusive feature of their Facebook-esque Google+.
bawolff 9 hours ago [-]
And in this theory, the reason why nobody else made a popular feed reader was?
eduction 9 hours ago [-]
As a digital pedant I am very sympathetic to what prompted the creation of Atom. RSS2, for example, under-specifies item "description" and "title", in particular how to put HTML in there, and using the once-most-common technique (entity-escaping the HTML) makes it tricky to reliably do even basic things (encode/decode left angle brackets and ampersands, because now you don't know whether to do so singly or doubly).
But the undeniable victory of RSS shows the importance of being first and "easy" (even when "easy" means sweeping edge case problems under the rug). And of humans: Major publishers like the New York Times had adopted RSS and saw no need to switch to Atom because it was good enough. I'd argue the (also underspecified) CSV format is another example of this phenomenon.
(As for the entity escaping dilemma, people mostly just moved to using CDATA for their feed-embedded HTML, although I imagine people who write RSS readers still need to come up with semantics for figuring out if a title or description payload contains encoded HTML or not.)
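The single-vs-double escaping ambiguity described above is easy to reproduce with the stdlib escape helpers; this sketch shows why an RSS reader that sees `&lt;b&gt;` after one round of XML unescaping cannot tell what the publisher meant:

```python
from xml.sax.saxutils import escape, unescape

# A publisher who wants markup in a <description> has two readings of RSS2:
html_payload = "<b>bold</b>"

# Reading 1: the description IS HTML, escaped once for XML transport.
escaped_once = escape(html_payload)           # "&lt;b&gt;bold&lt;/b&gt;"

# Reading 2: the description is plain TEXT that happens to contain "<b>",
# so the author entity-encodes it first, then XML-escapes the result.
escaped_twice = escape(escape(html_payload))  # "&amp;lt;b&amp;gt;..."

# After the XML parser unescapes once, the two readings collide: a payload
# of "&lt;b&gt;bold&lt;/b&gt;" could mean "render bold text" (reading 1,
# unescape again) or "display the literal tags" (reading 2, stop here).
once_decoded = unescape(escaped_once)    # "<b>bold</b>"
twice_decoded = unescape(escaped_twice)  # "&lt;b&gt;bold&lt;/b&gt;"
```

CDATA sidesteps this by delivering the markup unescaped, which is why it won in practice.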
bossyTeacher 12 hours ago [-]
So many words to choose from, and they had to pick one that has already been used before. Why are techies so devoid of imagination?
Now, why a spec from 2005 is on the front page of Hacker News, I have no idea...
Dec 2005
I think at that time it was still ok?
The hyperscalers stopped that timeline from winning, though.
YouTube had atom feeds and I don't think Amazon and Microsoft have relevant syndication.
Meta is surely responsible but that's it, imo.
If you want all the commits from a repo you can do something like: https://github.com/rust-lang/rust/commits/main/.atom
And if you are now only interested in the num module you can do: https://github.com/rust-lang/rust/commits/main/library/std/s...