How to parse for specific text?


Question

I want to use jsoup to parse just the middle block of text that describes the game on pages like these:

http://www.upcominggames.com/2113/Halo+Combat+Evolved+Anniversary/
http://www.upcominggames.com/478/Gears+of+War+3/

What jsoup tags would I use to parse this and extract just the article?

What would be a common selector for the two articles above?

Edit:

What I want to do is parse this part:


Gears of War 3 Facts
Gears of War 3 is a third-person shooter published by Microsoft and developed by Epic Games, and it is set to be released on September 20, 2011 in the US, Australian and Europe and on September 22 in Japan.


Gears of War 3 Synopsis
This Xbox 360 exclusive conclusion to the Gears of War trilogy, Gears of War 3 places players in the middle of an exciting experience and story of survival, hope and brotherhood. This third-person shooter dramatically leads players through the exciting world with more color and detail than ever before. Plus, its exciting multiplayer mode will lead players wanting more even after they’ve finished the campaign.


Gears of War 3 Gameplay
Anyone who has played a Gears of War game will feel familiar when they play Gears of War 3, but that doesn’t meant they won’t be faced with a few new surprised. The environments are much more detailed and immersive, adding to the excitement and thrill the Gears of War franchise is known for. Featuring more enemies than previous installments of the Gears of War series, Gears of War 3 will offer players a brand new challenge as they try to save the human race from complete destruction. If players own a 3D TV, they’ll be able to play this new installment in 3D to have a completely immersive experience.


Gears of War 3 Multiplayer
The multiplayer additions to Gears of War 3 make the game a big step up from Gears of War 2. Starting with dedicated servers to handle matchmaking, Epic Games has put a lot of effort into making this the best Gears experience yet. With Capture the Leader, King of the Hill and other multiplayer modes, players will be able to take their game online against other players in exciting deathmatches.

I want to parse each bold heading into a separate TextView and then load its content underneath it, basically just as it is laid out above.

If you highlight the text and click "View Selection Source", you'll see what I am trying to parse.

I am familiar with jsoup. I just need some help with this part.

Recommended Answer

Yes, I get what you're saying. I think jsoup can extract this easily if you study the web page's source code and find the tags and attributes that the pages have in common. Ones to try include:


  • Elements that have the tag div
  • The attribute id set to game-desc

The text returned from just these two filters will likely get you what you want.

For example (edit: code simplified to use select(...)):

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class HaloStuff {
   private static final String TEST_URL_1 = "http://www.upcominggames.com/" +
        "2113/Halo+Combat+Evolved+Anniversary/";
   // CSS selector: every div whose id attribute is "game-desc"
   private static final String GAME_DESC_SELECTOR = "div[id=game-desc]";

   public static void main(String[] args) {
      try {
         // Fetch the page and parse it into a DOM
         Document jsDoc = Jsoup.connect(TEST_URL_1).get();

         // Print the visible text of each matching element
         Elements textEles = jsDoc.select(GAME_DESC_SELECTOR);
         for (Element ele : textEles) {
            System.out.println(ele.text());
         }

      } catch (IOException e) {
         // Network or HTTP failure while fetching the page
         e.printStackTrace();
      }
   }
}
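
To get each bold heading and its content as a separate pair (for example, to put them into two TextViews), one possible approach is to walk the child nodes of the game-desc div and start a new section at every bold element. The sketch below is only an illustration: it assumes the headings are wrapped in b or strong tags directly inside the div, which I have not verified against the real pages, and the class name SectionSplitter and the sample markup are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class SectionSplitter {
   /** Returns {heading, body} pairs found inside div#game-desc. */
   public static List<String[]> split(String html) {
      List<String[]> sections = new ArrayList<String[]>();
      Document doc = Jsoup.parse(html);
      Element desc = doc.select("div[id=game-desc]").first();
      if (desc == null) {
         return sections; // no matching div in this document
      }

      String heading = null;
      StringBuilder body = new StringBuilder();
      for (Node node : desc.childNodes()) {
         // Assumption: headings are direct <b> or <strong> children
         boolean isBold = node instanceof Element
               && ((Element) node).tagName().matches("b|strong");
         if (isBold) {
            // A new heading closes the section collected so far
            if (heading != null) {
               sections.add(new String[] { heading, clean(body) });
            }
            heading = ((Element) node).text();
            body.setLength(0);
         } else if (node instanceof TextNode) {
            body.append(((TextNode) node).text());
         } else if (node instanceof Element) {
            // Other elements (br, i, a, ...) contribute their text plus a space
            body.append(((Element) node).text()).append(' ');
         }
      }
      if (heading != null) {
         sections.add(new String[] { heading, clean(body) });
      }
      return sections;
   }

   // Collapse runs of whitespace and trim both ends
   private static String clean(StringBuilder sb) {
      return sb.toString().replaceAll("\\s+", " ").trim();
   }

   public static void main(String[] args) {
      // Hypothetical markup; the real page may be structured differently
      String html = "<div id=\"game-desc\">"
            + "<b>Gears of War 3 Facts</b><br>"
            + "Gears of War 3 is a third-person shooter published by Microsoft."
            + "<b>Gears of War 3 Synopsis</b><br>"
            + "This Xbox 360 exclusive conclusion to the Gears of War trilogy."
            + "</div>";
      for (String[] section : split(html)) {
         System.out.println(section[0]); // heading
         System.out.println(section[1]); // content under that heading
      }
   }
}
```

With a live page you would pass the fetched HTML (or adapt split to take the Document from Jsoup.connect(...).get()) and then set section[0] and section[1] on your heading and content TextViews. If the headings turn out to use a different tag or a class, only the isBold check needs to change.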
