How to reference a code block's region points in Org mode?

I want to use libxml-parse-html-region to test some HTML extraction functions. But it only accepts region points as the input parameters. Does Org mode provide a way to reference another code block as a region?

E.g.

#+Name: html-test
#+BEGIN_EXAMPLE
<html>
<body>test</body>
</html>
#+END_EXAMPLE

#+BEGIN_SRC elisp
(libxml-parse-html-region html-test-start-point html-test-end-point)
#+END_SRC

So, how can I set the html-test-start-point and html-test-end-point?



Solution 1:[1]

The org-element library (the official parser for Org mode) provides the capability to get the body of the block as a string. You can then insert that string into a temporary buffer, where the region specified by (point-min) and (point-max) contains exactly the body of the block. You can find developer documentation for the library on Worg.

Since you don't specify the interface exactly, all I can do here is give some guidelines. The basic idea is to set point to the beginning of the example block somehow (maybe by searching for the name html-test and then moving to the beginning of the line). Once there, you call org-element-at-point to parse the element (and maybe sanity-check that you are in an example block). From the parse tree, extract the value of the :value property which is the body of the block as a string. Create a temporary buffer, insert the string and then execute the libxml function in the context of the temporary buffer.
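
The search step mentioned above might look like the following minimal sketch. The helper name my/org-named-example-body is hypothetical (it is not part of Org), and the regexp assumes the block is labelled with a #+NAME: keyword on its own line. It has to be called with the Org buffer current and in org-mode:

```elisp
(require 'org)
(require 'org-element)

(defun my/org-named-example-body (name)
  "Return the body of the example block called NAME in the current buffer.
Hypothetical helper for illustration; not part of Org."
  (save-excursion
    (save-restriction
      ;; look at the whole buffer, even if it is currently narrowed
      (widen)
      (goto-char (point-min))
      (let ((case-fold-search t)        ; match #+name: and #+NAME: alike
            (re (format "^[ \t]*#\\+name:[ \t]*%s[ \t]*$"
                        (regexp-quote name))))
        (unless (re-search-forward re nil t)
          (user-error "No block named %s" name))
        ;; move to the beginning of the #+name: line before parsing
        (goto-char (match-beginning 0))
        (let ((element (org-element-at-point)))
          (unless (eq (org-element-type element) 'example-block)
            (user-error "%s is not an example block" name))
          ;; the :value property holds the body of the block as a string
          (org-element-property :value element))))))
```

Calling (my/org-named-example-body "html-test") would then return the body as a string, ready to be inserted into a temporary buffer.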

Here is some illustrative code, commented fairly liberally and I hope in an enlightening manner (for the mysterious (goto-char 130), just keep reading - it's explained in the complete Org mode file below):

  (save-excursion
    ;; this should probably be replaced by a search for the correct place
    (goto-char 130)
    (let* (;; parse the element at point
           (parsed (org-element-at-point))
           ;; retrieve the :value property, i.e., the body of the block
           (value (org-element-property :value parsed)))
      ;; check that it is an example block
      (unless (eq (org-element-type parsed) 'example-block)
        (error "Not an example block"))
      ;; create a temp buffer
      (with-temp-buffer
        ;; in the context of the temp buffer, insert the body of the block
        (insert value)
        ;; do something with the region containing the body by using (point-min)
        ;; as the beginning of the region and (point-max) as the end;
        ;; "%s" keeps any % characters in the body from being read as format specs
        (message "%s" (buffer-substring (point-min) (point-max)))
        (libxml-parse-html-region (point-min) (point-max)))))

Here is the same code as a code block in an Org mode file that also contains an example block for it to operate on. Copy and paste the whole thing into a foo.org file, open it in Emacs, and press C-c C-c on the code block to see it in action:

* foo

We have a named block and we want to get the beginning and the end of the region
that consists of the body of the block.

#+Name: html-test
#+BEGIN_EXAMPLE
<html>
<body>test</body>
</html>
#+END_EXAMPLE

The position of the beginning of the block is 130 - you can check by evaluating
this expression with C-x C-e after the closing paren:

(goto-char 130)

You should end up at the beginning of the `#+name:` line.

If we imagine a code block like this:

#+BEGIN_SRC elisp
(libxml-parse-html-region html-test-start-point html-test-end-point)
#+END_SRC

the question is how to calculate the html-test-{start,end}-point values.

The suggested solution is to copy the body of the block to a temporary
buffer and do the evaluation of the XML function in that buffer. The region
is then delimited by (point-min) and (point-max).

* Code                                                                                                        :noexport:

#+begin_src elisp :results drawer
  (save-excursion
    ;; this should probably be replaced by a search for the correct place
    (goto-char 130)
    (let* (;; parse the element at point
           (parsed (org-element-at-point))
           ;; retrieve the :value property, i.e., the body of the block
           (value (org-element-property :value parsed)))
      ;; check that it is an example block
      (unless (eq (org-element-type parsed) 'example-block)
        (error "Not an example block"))
      ;; create a temp buffer
      (with-temp-buffer
        ;; in the context of the temp buffer, insert the body of the block
        (insert value)
        ;; do something with the region containing the body by using (point-min)
        ;; as the beginning of the region and (point-max) as the end;
        ;; "%s" keeps any % characters in the body from being read as format specs
        (message "%s" (buffer-substring (point-min) (point-max)))
        (libxml-parse-html-region (point-min) (point-max)))))
#+end_src

As the OP points out in a comment and @TianshuWang describes in his answer, you can add a variable to the source code block and get the body of the example block directly:

#+begin_src elisp :results drawer :var value=html-test

  ;; create a temp buffer
  (with-temp-buffer
    ;; in the context of the temp buffer, insert the body of the block
    (insert value)
    ;; do something with the region containing the body by using (point-min)
    ;; as the beginning of the region and (point-max) as the end
    (message "%s" (buffer-substring (point-min) (point-max)))
    (libxml-parse-html-region (point-min) (point-max)))
#+end_src

But that bypasses the org-element library: what fun is that? :-)
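
If real buffer positions are wanted after all, as the question literally asks, they can be derived from the parsed element. This is only a rough sketch: my/org-example-block-region is a hypothetical name, and it assumes a well-formed example block and relies on the :post-affiliated and :end properties that org-element records for elements:

```elisp
(require 'org)
(require 'org-element)

(defun my/org-example-block-region (element)
  "Return (START . END) delimiting the body of example block ELEMENT.
Hypothetical helper for illustration; not part of Org."
  (save-excursion
    ;; :post-affiliated is the start of the #+BEGIN_EXAMPLE line,
    ;; i.e. just past any #+NAME: keyword
    (goto-char (org-element-property :post-affiliated element))
    (forward-line 1)                    ; first line of the body
    (let ((start (point)))
      ;; :end lies after the blank lines that follow the block
      (goto-char (org-element-property :end element))
      (skip-chars-backward " \t\n")     ; back to the end of #+END_EXAMPLE
      (beginning-of-line)               ; start of the #+END_EXAMPLE line
      (cons start (point)))))
```

With point on the block, the two positions could be passed straight to libxml-parse-html-region in the Org buffer. Unlike the :value string, the raw region still contains any indentation or comma escapes present in the block.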

Solution 2:[2]

After consulting the manual, I suspect that there is no simple way to achieve this.

You can pass another block to a source block as an argument (see https://orgmode.org/worg/org-contrib/babel/intro.html#arguments-to-source-code-blocks). However, Emacs's libxml functions only operate on a buffer region; none of them accepts a string.

#+name: html-test
#+begin_example
<html>
<body>test</body>
</html>
#+end_example

#+begin_src elisp :var html=html-test 
(message html)
#+end_src
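
Because the libxml functions only take a region, a thin wrapper can make them usable with the string that :var provides. This is a minimal sketch, assuming Emacs 27.1 or later built with libxml2 (which is what libxml-available-p reports); my/html-parse-string is a hypothetical name:

```elisp
(defun my/html-parse-string (html)
  "Parse the string HTML and return the resulting DOM tree.
Hypothetical helper: wraps `libxml-parse-html-region' in a temp buffer."
  ;; assumption: libxml-available-p exists (Emacs 27.1+) and returns non-nil
  (unless (and (fboundp 'libxml-available-p) (libxml-available-p))
    (error "This Emacs was built without libxml2 support"))
  (with-temp-buffer
    ;; the temp buffer contains exactly the string, so the whole buffer
    ;; is the region to parse
    (insert html)
    (libxml-parse-html-region (point-min) (point-max))))
```

In the block above, (message html) could then be replaced with (my/html-parse-string html) to get the parse tree.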

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Tianshu Wang