There’s a difference between ‘processing’ the text and ‘parsing’ it. The processing described in the section you posted is fine, and you can manage a similar level of processing on HTML. The tricky/impossible bit is parsing the languages. For instance, you can’t write a regex that’ll reliably find the subject, object and verb in any English sentence, and you can’t write a regex that’ll break an HTML document down into a hierarchy of tags, as regexes can’t count depth of recursion. HTML isn’t a regular language anyway, meaning it can’t be reliably parsed with a regular-expression parser.
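To make the nesting point concrete, here is a minimal Python sketch. The sample markup, the `max_depth` helper, and the simplified tag pattern (no attributes, no void elements) are all assumptions for illustration, not a real HTML parser. It shows a non-greedy regex pairing the wrong tags, and an explicit stack doing the depth counting that the regex alone can't.

```python
import re

# Hypothetical sample markup: one <div> nested inside another.
html = "<div><div>inner</div>outer</div>"

# A non-greedy regex pairs the first <div> with the first </div>,
# so the captured "content" of the outer element is cut short.
regex_match = re.search(r"<div>(.*?)</div>", html).group(1)


def max_depth(markup: str) -> int:
    """Return the deepest nesting level, tracked with an explicit stack.

    Assumes simplified tags of the form <name> and </name> only.
    """
    stack = []
    deepest = 0
    for closing, name in re.findall(r"<(/?)(\w+)>", markup):
        if closing:
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                raise ValueError(f"unbalanced </{name}>")
        else:
            stack.append(name)
            deepest = max(deepest, len(stack))
    if stack:
        raise ValueError("unclosed tags: " + ", ".join(stack))
    return deepest


print(repr(regex_match))  # the regex stops at the inner closing tag
print(max_depth(html))    # the stack sees both levels
```

The regex still does useful work here (spotting individual tags), but the hierarchy only comes out once something outside the regex keeps track of depth.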

For instance, you can’t write a regex that’ll reliably find the subject, object and verb in any English sentence

Identifying parts of speech isn’t a requirement of the word ‘parse’; that’s the linguistic definition. In computer science, identifying tokens is parsing.

https://en.m.wikipedia.org/wiki/Parsing
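As a rough illustration of what "identifying tokens" looks like in practice, here is a small Python sketch of a regex-driven tokenizer. The `TOKEN` pattern, the `tokenize` helper, and the token kind names are hypothetical and deliberately simplified (tags without attributes only). It produces a flat token list and makes no attempt to build a hierarchy from it.

```python
import re

# Hypothetical, simplified token pattern: either a bare tag such as
# <p> or </p>, or a run of text containing no "<".
TOKEN = re.compile(r"<(?P<close>/?)(?P<tag>\w+)>|(?P<text>[^<]+)")


def tokenize(markup: str) -> list[tuple[str, str]]:
    """Split markup into a flat list of (kind, value) tokens."""
    tokens = []
    for m in TOKEN.finditer(markup):
        if m.group("tag"):
            kind = "CLOSE" if m.group("close") else "OPEN"
            tokens.append((kind, m.group("tag")))
        else:
            tokens.append(("TEXT", m.group("text")))
    return tokens


for token in tokenize("<p>hi <b>there</b></p>"):
    print(token)
```

Whether this step alone counts as parsing is exactly what the thread is arguing about; the sketch only shows that a regex handles this part comfortably.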
