Scrapy - getting HTML without outer tag

I'm scraping a page, using Scrapy. I want the HTML contents of the TD with "text" class:

<tr valign="top">
  <td class="text" width="100%">
    <b>A bunch of HTML</b>

    <ul type="disc">
      <li>Some random text</li>
    </ul>
  </td>
</tr>

This is my Scrapy line:

for body in response.css('td.text'):
  yield {'body': body.extract()}

Which works - except it includes the surrounding td:

[
  {"body": "<td class="text" width="100%"> <b>A bunch of HTML</b> <ul type="disc"> <li>Some random text</li> </ul> </td>"}
]

This is what I want:

[
  {"body": "<b>A bunch of HTML</b> <ul type="disc"> <li>Some random text</li> </ul>"}
]

Halp? :)

xpath scrapy

Solution 1:^[1]

Try this selector:

response.css('td.text *')

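The * matches every element nested inside the td, at any depth. Note that nested elements are returned twice (the li appears inside the ul and again as its own match), and text placed directly inside the td is not matched at all.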

Solution 2:^[2]

Well, I found a solution, although I still think there must be a smarter way:

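    # child::node() selects every direct child of the td, text nodes included.
    nodes = response.xpath("//td[@class='text']/child::node()").extract()
    # Joining the serialized children gives the td's inner HTML without the td tag.
    yield {'body': ''.join(nodes)}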

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	DharmanBot
Solution 2	Benjamin Rasmussen
